A password composition policy restricts the space of allowable passwords to eliminate weak passwords that
are vulnerable to statistical guessing attacks. Usability studies have demonstrated that existing password
composition policies can sometimes result in weaker password distributions; hence a more principled
approach is needed. We introduce the first theoretical model for optimizing password composition policies. We
study the computational and sample complexity of this problem under different assumptions on the
structure of policies and on users' preferences over passwords. Our main positive result is an algorithm that,
with high probability, constructs almost optimal policies (which are specified as a union of subsets of
allowed passwords), and requires only a small number of samples of users' preferred passwords. We
complement our theoretical results with simulations using a real-world dataset of 32 million passwords.
1. INTRODUCTION

Imagine a web surfer, an online shopper, or a reviewer in a prominent CS and Economics
conference¹ who logs on for the first time to a server so that she can sign up
for some service, place a shopping order, or view a list of assigned papers. Such a user
registers on the server by choosing a username and picking a password. Naturally, our 
user's first attempt at picking a password is her favorite combination '123456', which 
the server declines. She then has to pick a password that follows certain guidelines: 
of suitable length, involving lower- and upper-case letters, with numbers or special 
characters, etc. Such password composition policies defend against the "first line" of 
attack - guessing attacks by uninformed attackers (attackers with no previous knowl- 
edge of the user whose account they are trying to break into). 

Password composition policies are a necessity because, without them, user-selected
passwords are predictable. Indeed, many unrestricted users would select simple
passwords like '123456', 'password' and 'letmein' [Doel 2012]. Furthermore, this
issue is of great importance to today's economy. Passwords are commonly used in electronic
commerce to protect financial assets. In fact, the passwords themselves have
financial value. Symantec reported that compromised passwords are sold for between
$4 and $30 on the black market [Fossi et al. 2008], and a 2004 Gartner case study
[Witty et al. 2004] estimated that it cost a large firm over $17 per password-reset
call. Nevertheless, existing password composition policies are typically not principled,
and do not necessarily result in less common passwords. For example, studies show
that users respond to restrictions in predictable ways [Komanduri et al. 2011], or pick
weaker passwords due to user fatigue [Clair et al. 2006; Kruger et al. 2008].

¹ All three might be the same person.

In this paper, we initiate the algorithmic study of password composition policies. 
Such policies restrict the space of passwords to a subset of allowed passwords, and force 
each user to pick a password in this subset. Thus, $n$ users induce a distribution over
passwords where for a password $w$, $\Pr[w] = \frac{1}{n}\,|\{i : \text{user } i \text{ picks } w\}|$. By declaring different
subsets of allowed passwords, different password composition policies induce different
distributions. Our work formalizes and addresses the algorithmic problem a server 
administrator faces when designing a password composition policy; we ask: 

In what settings can the information about the users' preferences over pass- 
words allow us to design a password composition policy that is guaranteed 
to induce a password distribution as close to uniform as possible? 

We wish to stress at this point that we do not take a cryptographic approach to 
the problem: we do not design a protocol aimed at amplifying a password's strength, 
nor do we rely on standard cryptographic assumptions or techniques in designing our 
password composition policies. Single-factor authentication does not defend against 
an attacker who learns about the most probable password from an external source. 
Furthermore, because password systems often allow users multiple attempts in enter- 
ing their password, an attacker can make a small number of guesses with impunity. 
Therefore, we instead focus on the design and analysis of algorithms for optimizing 
the password composition policy's induced distribution over passwords, and in our the- 
oretical results compare the performance of our algorithm to the optimal policy among 
exponentially many potential policies in the worst case. 

1.1. Our Model 

We study the algorithmic problem of optimizing password composition policies along 
multiple dimensions: the goal, the user model, and the policy structure. 

Goal. We focus on designing a policy that maximizes the minimum entropy of the resulting
password distribution. Specifically, we assume the server deals with $n$ users,
each picking a password from some space of passwords $\mathcal{P}$ that respects the server's
password composition policy. These $n$ passwords form a distribution over the domain
of all allowed passwords and our goal is to minimize the probability of the most
likely password. This is a natural goal (see Section 7), as opposed to maximizing the
Shannon entropy of the distribution, which for example is still high even if half the
people choose the same password and the other half choose a password uniformly at
random from $\mathcal{P}$. From a security standpoint, the minimum entropy captures the fraction
of accounts that could be compromised in one guess (it is the negative logarithm of the
probability of the most likely password). For example, an adversary
would be able to crack 0.9% of RockYou passwords [Imperva 2010] with only one guess.
Alternatively, should the attacker attempt to break into only one account, the minimum
entropy captures the likelihood that the account is compromised on the first
guess. We also consider a slightly stronger goal of minimizing the fraction of accounts
that could be compromised using $k$ guesses, that is, the overall probability of the $k$
most likely passwords [Boztas 1999].
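The contrast between the two entropy notions can be checked numerically. The following is a minimal sketch, not part of the paper: the function names and the toy population (half the users pick '123456', the other half pick distinct made-up passwords rather than uniformly random ones) are assumptions chosen for illustration.

```python
import math
from collections import Counter


def top_k_probability(passwords, k=1):
    """p(k): total probability mass of the k most common passwords."""
    counts = Counter(passwords)
    return sum(c for _, c in counts.most_common(k)) / len(passwords)


def min_entropy(passwords):
    """Negative log (base 2) of the probability of the most likely password."""
    return -math.log2(top_k_probability(passwords, 1))


def shannon_entropy(passwords):
    """Shannon entropy (in bits) of the empirical password distribution."""
    n = len(passwords)
    return -sum((c / n) * math.log2(c / n) for c in Counter(passwords).values())


# Hypothetical population: 512 users share one password, 512 pick distinct ones.
users = ["123456"] * 512 + [f"pw{i}" for i in range(512)]

one_guess = top_k_probability(users)   # fraction of accounts cracked in one guess
h_min = min_entropy(users)             # stays low: one guess cracks half the accounts
h_shannon = shannon_entropy(users)     # noticeably higher despite the weak head
```

In this toy population the Shannon entropy is several bits, while the minimum entropy is a single bit, which matches the paper's reason for optimizing the latter.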

User model. We consider two models for how users select passwords when presented 
with a password composition policy. 

EC'13, June 16-20, 2013, Philadelphia, PA, Vol. X, No. X, Article X, Publication date: February 2013.



Optimizing Password Composition Policies 






In the ranking model, each user has an implicit ranking over passwords, from the
most preferred to the least preferred. Given a password policy, each user selects the
highest-ranking password among those allowed by the policy. There is a distribution 
over the space of rankings that determines the fraction of users with each possible 
ranking. Note that for any password composition policy, such a distribution over rank- 
ings induces a distribution over the most preferred allowed passwords. 

In the normalization model, there is a distribution $\mathcal{D}$ over the space of all passwords.
This distribution tells us the likelihood that an unrestricted user would select a given
password. Given a password composition policy, $\mathcal{D}$ induces a new distribution over the
allowed passwords (which can be obtained by normalizing the probabilities under $\mathcal{D}$ of
the allowed passwords). When we ban a password the fraction of users that prefer each 
allowed password grows; the natural interpretation is that users who preferred an 
allowed password still use that password, but users who preferred a banned password 
are redistributed among the allowed passwords according to the induced distribution. 

As we show, the normalization model is strictly more restrictive than the ranking 
model: any distribution in the normalization model can be simulated in the ranking 
model, but there exist hardness results for the ranking model that do not hold for the 
normalization model. 
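The difference between the two user models can be made concrete with a small sketch. The code below is an illustration under assumed data structures (a ranking is a Python list with the most preferred password first, a policy is a set of allowed passwords); the four toy users are hypothetical and not taken from the paper.

```python
from collections import Counter


def ranking_model(rankings, allowed):
    """Each user picks the highest-ranked password that the policy allows."""
    picks = [next(w for w in ranking if w in allowed) for ranking in rankings]
    return {w: c / len(picks) for w, c in Counter(picks).items()}


def normalization_model(dist, allowed):
    """Renormalize the unrestricted distribution over the allowed passwords."""
    mass = sum(p for w, p in dist.items() if w in allowed)
    return {w: p / mass for w, p in dist.items() if w in allowed}


# Four hypothetical users; unrestricted, half of them pick '123456'.
rankings = [
    ["123456", "password", "letmein"],
    ["123456", "password", "letmein"],
    ["password", "123456", "letmein"],
    ["letmein", "123456", "password"],
]
unrestricted = {"123456": 0.5, "password": 0.25, "letmein": 0.25}
allowed = {"password", "letmein"}  # the policy bans '123456'

by_ranking = ranking_model(rankings, allowed)
by_normalization = normalization_model(unrestricted, allowed)
```

Both inputs agree on the unrestricted distribution, yet after banning '123456' the ranking model sends both displaced users to 'password', whereas the normalization model spreads them proportionally. This is the extra freedom that makes the ranking model more general.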

Policy structure. We consider the best policy that is restricted to manipulation of a 
given set of rules — each rule is simply a predefined subset of potential passwords. 
These rules are given to us as part of the problem (see Section 7 for a discussion of this
point). If we interpret a rule as a subset of banned passwords (e.g., passwords shorter 
than seven characters), its complement (e.g., passwords of at least seven characters) 
can be interpreted as a subset of allowed passwords. As such, when we take the union 
of rules we get either a set of banned passwords (negative rules) or allowed passwords 
(positive rules); this is our password composition policy. While the distinction between 
the two cases may at first seem a mere technicality, it is in fact quite significant due to 
the following observation. If we ban the union of rules then in order to ban a password 
that was picked by too many users, we may ban any rule that contains this pass- 
word. In contrast, if we allow a union of rules then in order to ban this password we 
must not allow any rule that contains it. In other words, when our goal is to discard 
a password in the negative rules setting, we have multiple ways to do so. When our 
goal is to discard a password in the positive rules setting, we have only one way to 
do so — excluding all rules that allow this password. As we shall see, this seemingly 
small difference leads to a clear separation between the two scenarios in terms of the 
complexity of designing optimal policies. 

We pay special attention to the case where each password has its own singleton rule. 
In this setting, a policy can be interpreted as a "blacklist" of banned passwords that do 
not necessarily share common characteristics. Note that when each password has its 
own singleton rule, it does not matter whether these rules are positive or negative. 



1.2. Our Results 

As we noted above, a password composition policy induces a distribution over most 
preferred passwords (in both user models). Hence we can study algorithms that sample 
these distributions. One can obtain such samples by asking random users to choose a 
password that is constrained by a certain policy. Clearly, though, we need the number 
of samples to be "small". The size of the space of all passwords $\mathcal{P}$, which we denote
by $N$, is typically very large (e.g., $\mathcal{P}$ can include all passwords that are no longer
than 32 ASCII characters). We wish to maximize entropy using a number of samples
that does not depend on $N$.
Table I: Summary of Complexity Results.
Before tackling this goal directly, we study the problem in a simpler setting where
the preferences of all users are given to us as input (i.e., there is no uncertainty). In
particular, here $\mathcal{P}$ is a part of the input and algorithms are allowed to run in time polynomial
in $N$. The computational complexity of problems in this setting informs their
study in the sampling setting: it is hopeless to design efficient sampling algorithms for 
problems that are computationally hard, but computationally tractable problems may 
(or may not) have efficient sampling algorithms. 

Table I summarizes our complexity results. The parameter $k$ refers to our optimization
target: minimizing the likelihood of the $k$ most likely passwords. Some results are
direct corollaries of others, using the fact that singleton rules are a special case of
positive rules and the fact that the normalization model is a special case of the ranking
model (see Section 2). Looking at the table one immediately notices a clear separation
between negative rules and positive rules: optimization using the latter is much easier.

We therefore focus on positive rules in our attempt to design an efficient sampling 
algorithm. Our main result is the best one could hope for in this setting. We design 
an algorithm that works in the more general ranking model, and finds a policy whose 
entropy is $\varepsilon$-close to optimal with probability $1 - \delta$, for any given $\varepsilon, \delta > 0$. The required
number of samples is polynomial in $1/\varepsilon$, $\log(1/\delta)$, and the number of positive rules $m$.
We can assume that m is small, because each rule corresponds to a subset of passwords 
that can be concisely described to users. 

These results can be applied in a practical setting, and we show this through simu- 
lated sampling experiments using natural rules and a large dataset of real passwords. 
The experimental results provide evidence for the difficulty of the negative rules set- 
ting: we search all combinations of rules to find the optimal policy and then attempt 
to discover this policy by making decisions both randomly and with a heuristic. In the 
negative rules setting, neither approach succeeded at finding the optimal policy after 
hundreds of iterations at various sample sizes, and average-case performance did not 
improve with sample size. In the positive rules setting, the average-case performance 
of our efficient algorithm improved with sample size and, with a moderate sample size, 
found policies that were either optimal or very close to optimal. 

1.3. Related Work

It has been repeatedly demonstrated that users tend to select easily guessable passwords
[Imperva 2010; Doel 2012; Bonneau 2012] and NIST recommends that organizations
"should also ensure that other trivial passwords cannot be set," to thwart potential
attackers [Scarfone and Souppaya 2009]. Unfortunately, this task is more difficult
than it might appear at first. Policies were initially developed without empirical
data to support them, since such data was not available to policy designers [Burr et al.
2006]. When hackers leaked the RockYou dataset to the Internet, both researchers
and attackers suddenly had access to password data, leading to many insights into
true passwords [Weir et al. 2010]. However, recent research analyzing leaked datasets
from non-English speakers, notably Hebrew and Chinese-language websites, shows
that trivial password choices can vary between contexts, making a simple blacklist approach
ineffective [Bonneau and Xu 2012]. This means that, depending on the context,
a policy based on leaked password data might provide no security guarantee, and it
has ethical issues as well.

To combat this issue, researchers have turned to a sampling approach. Bonneau
[2012] added a system for sampling to the Yahoo! password infrastructure. This
system allows one to gain empirical data about the frequency distribution of pass- 
words without revealing the passwords themselves. Such approaches provide a way of 
gathering empirical data about passwords while maintaining the anonymity of users. 
Our algorithms could be used in conjunction with such an infrastructure to optimize 
policies. 

Komanduri et al. [2011] studied the effectiveness of several basic password composition
policies by using Amazon's Mechanical Turk to conduct a large scale user study.
They found that people often respond to restrictions in predictable ways (e.g., if the
password needs to contain a capital letter users might tend to capitalize the first letter
of a password) and provide very general recommendations for password composition
policies. However, no theoretical model has been proposed for studying the password
composition problem.

Schechter et al. [2010] suggest using a popularity oracle to prevent individual passwords
that have been used too frequently from being selected by new users. They also
proposed using the count-min sketch data structure [Cormode and Muthukrishnan
2005] to build such a popularity oracle. Malone and Maher [2012] suggest a similar
system using a Metropolis-Hastings scheme to force an approximately uniform
distribution on passwords. Usability results on the effectiveness of dictionary
checks [Komanduri et al. 2011] suggest that such policies would be very frustrating
since the policy is hidden from users behind an oracle. In contrast, we seek to construct
optimal policies from combinations of rules that are visible to the user and can
be described in natural language.

This consideration of users is important to electronic commerce, even where security
is concerned. Florencio and Herley [2010] studied the economic factors that drive institutions
to adopt strict password composition policies and find that they often value
the user experience over security. An e-mail provider like Yahoo! might adopt simple 
composition policies because a frustrated user could easily switch to Gmail, while uni- 
versities are free to adopt strict policies because users cannot switch easily. 

2. A MODEL OF PASSWORD COMPOSITION POLICIES 

We use $\mathcal{P}$ to denote the space of all possible passwords, and $N = |\mathcal{P}|$ to denote the
total number of passwords. We denote the number of users by $n$.

A password composition policy may be specified in terms of rules. A rule is a subset
of passwords $R \subseteq \mathcal{P}$ (e.g., the set of all passwords with more than seven characters).
We use $R_1, \ldots, R_m$ to denote a list of rules that may be active or inactive. We consider
two schemes.

- Positive Rules: A password $w$ is allowed if and only if it is allowed by some active positive
rule. Formally, a password composition policy $A_S = \bigcup_{i \in S} R_i$ is specified by a set
$S \subseteq [m] = \{1, \ldots, m\}$ of active rules. In this setting rules should consist of sets of passwords
which we expect to be strong (e.g., $R_i$ might be the set of all passwords longer
than 10 characters, or the set of all passwords that use both upper and lowercase
letters, or the set of all passwords that do not include a dictionary word).

- Negative Rules: A password $w$ is allowed if and only if it is not contained in any active
negative rule. Formally, a solution $A_S = \{w \in \mathcal{P} \mid w \notin \bigcup_{i \in S} R_i\}$ is given by a subset
$S \subseteq [m]$ of active rules. A negative rule should consist of passwords that we expect
to be weak (e.g., $R_i$ might be the set of all passwords without an uppercase letter, or
the set of all passwords shorter than 6 characters, or the set of all passwords that
include a dictionary word).

We also consider the special case of singleton rules, where our rules are
$\{w_1\}, \ldots, \{w_N\}$. Equivalently, we are allowed to ban or allow any individual password.
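The two schemes can be sketched directly from their definitions. In the code below the tiny password space, the two predicates, and the function names are hypothetical choices for illustration; the only thing taken from the text is that a positive policy is the union of the active rules and a negative policy is the complement of that union.

```python
def positive_policy(rules, active):
    """A_S for positive rules: the union of the active (allowed) sets."""
    allowed = set()
    for i in active:
        allowed |= rules[i]
    return allowed


def negative_policy(space, rules, active):
    """A_S for negative rules: every password outside all active (banned) sets."""
    banned = set()
    for i in active:
        banned |= rules[i]
    return set(space) - banned


# Hypothetical four-password space and two natural-language rules.
space = {"abc", "abcdefgh", "Abc", "Abcdefgh"}
long_enough = {w for w in space if len(w) >= 8}
has_upper = {w for w in space if any(ch.isupper() for ch in w)}

# Positive reading: a password is allowed if it satisfies SOME active rule.
allowed_positive = positive_policy([long_enough, has_upper], {0, 1})

# Negative reading: the complements are banned, so a password must avoid ALL of them.
too_short = space - long_enough
no_upper = space - has_upper
allowed_negative = negative_policy(space, [too_short, no_upper], {0, 1})
```

With the same two underlying predicates, the positive policy allows three passwords and the negative policy allows only one, which illustrates why the two settings behave so differently under optimization.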

We use $\Pr[w \mid A]$ to denote the probability of a password $w$ given composition policy
$A$. For $w \notin A$ we have $\Pr[w \mid A] = 0$. Given a set $W \subseteq \mathcal{P}$ we will also use
$\Pr[W \mid A] = \sum_{w \in W} \Pr[w \mid A]$. We use $p(k, A) = \max_{W \subseteq A : |W| = k} \Pr[W \mid A]$ to denote the probability of
the $k$ most popular passwords. Intuitively, $p(k, A)$ represents the probability that an
adversary can successfully guess a password using $k$ attempts. To avoid cumbersome
notation we sometimes use $p_1 = p(1, A)$ to denote the probability of the most popular
password. Similarly, we use $p_2$ (resp., $p_k$) to denote the probability of the second (resp.,
$k$'th) most popular password.

We consider two user models that determine how users choose passwords under a 
given password composition policy. 

- The ranking model: A ranking is simply a permutation of $\mathcal{P}$, which represents a user's
password preferences. It can be represented using an ordered list $\ell_i = w_{1,i}, \ldots, w_{N,i}$;
user $i$ prefers password $w_{j,i}$ to $w_{j+1,i}$ for all $j$. The ranking $\ell_i$ naturally tells us which
password $i$ will pick under any composition policy $A$. Specifically, $i$ will use password
$w_{A,i} = w_{j,i}$ where $j = \arg\min\{t : w_{t,i} \in A\}$. Given a distribution $\mathcal{D}$ over rankings, we
have

$\Pr[w \mid A] = \Pr_{\ell_i \sim \mathcal{D}}[w_{A,i} = w]$.

- The normalization model: Let $\mathcal{D}$ be an initial distribution over $\mathcal{P}$, and let
$\Pr[w] = \Pr_{x \sim \mathcal{D}}[w = x]$. If we select the composition policy $A$ then the probabilities of all $w \in A$
are simply re-normalized so that

$\forall A \subseteq \mathcal{P},\ \forall w \in A:\quad \Pr[w \mid A] = \frac{\Pr[w]}{\sum_{x \in A} \Pr[x]}$.

Clearly it holds for both models that the probability of an allowed password monotonically
increases as one bans more passwords. Formally, for all $w \in A$ and $B \subseteq \mathcal{P}$
such that $w \notin B$ we have

$\Pr[w \mid A] \le \Pr[w \mid A \setminus B]$. (1)

Another important observation is that for our purposes the ranking model is more
general than the normalization model. Indeed, we argue that a distribution $\mathcal{D}$ over
passwords in the normalization model induces an equivalent distribution over rankings.
To generate the most highly ranked password, draw a password $w_1$ from $\mathcal{D}$. Next,
let $A_1 = \mathcal{P} \setminus \{w_1\}$, and draw the next most preferred password $w_2$, where $w_2 = w$ with
probability $\Pr[w \mid A_1]$. In the following round we ban $w_2$ to obtain a policy $A_2$, and so
on, until all passwords have been banned.
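The round-by-round construction above amounts to sampling without replacement with probabilities renormalized after every draw. A minimal sketch follows; the function name, the `rng` parameter, and the toy distribution are assumptions for illustration, and the code assumes every password has strictly positive probability.

```python
import random


def sample_ranking(dist, rng=random):
    """Draw one ranking from a password distribution.

    Repeatedly samples from the distribution renormalized over the
    passwords that are still allowed, then bans the password just drawn.
    Assumes every probability in `dist` is strictly positive.
    """
    remaining = dict(dist)
    ranking = []
    while remaining:
        words = list(remaining)
        weights = [remaining[w] for w in words]
        # random.choices normalizes the weights, which is exactly Pr[w | A_t].
        w = rng.choices(words, weights=weights, k=1)[0]
        ranking.append(w)
        del remaining[w]  # ban w to obtain the next policy
    return ranking


# Hypothetical unrestricted distribution over three passwords.
dist = {"123456": 0.5, "password": 0.25, "letmein": 0.25}
example = sample_ranking(dist, random.Random(0))
```

Drawing many rankings this way and feeding them to a ranking-model simulation should reproduce, in expectation, the renormalized probabilities of the normalization model for any policy.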

Given $k \in \mathbb{N}$, our goal is to find $S \subseteq [m]$ such that $p(k, A_S) \le p(k, A_{S'})$ for all
$S' \subseteq [m]$. When $k = 1$ this goal is equivalent to maximizing the minimum entropy. If
$p(k, A_S) \le c \cdot p(k, A_{S'}) + \varepsilon$ for all $S' \subseteq [m]$ then we say that $S$ is a $(c, \varepsilon)$-approximation. To
simplify notation we sometimes use $c$-approximation instead of $(c, 0)$-approximation.

3. RANKING MODEL: COMPLEXITY RESULTS 

In this section we consider the complexity of finding the optimal password composition
policy in the more general ranking model when the organization is given complete information
about users' preferences. Specifically, the organization is given the rankings
$\ell_1, \ldots, \ell_n$ of every user.

Our first result is for the positive rules setting. Given positive rules $R_1, \ldots, R_m$ we
show that $p(k, A_S)$ can be optimized efficiently for constant values of $k$ (see Theorem
3.2). In fact, for the special case $k = 1$ we present a very simple algorithm that suffices.
Both algorithms can be easily extended to the less general normalization model. Our
algorithms are based on three simple ideas: (1) Reduced Preference Lists: each preference
list $\ell_i$ can be efficiently reduced to a short (length $\le m$) preference list $\hat{\ell}_i$. (2)
Guess and Check: start by guessing the 'structure' of the optimal solution and find
the resulting solution. (3) Iterative Elimination: find the most popular password $w$
and eliminate all positive rules that contain $w$. Our sampling algorithms are based on
the same core ideas.

Unfortunately, the picture is different in the negative rules setting even when $k$ is a constant.
Given negative rules $R_1, \ldots, R_m$ we show that it is hard to even $n^{1/3}$-approximate
$p(1, A_S)$. Also, for non-constant values of $k$ we show that it is hard to compute $p(k, A_S)$
in the singleton rules setting, which immediately implies hardness in both the positive
rules setting and in the negative rules setting. Given a stronger complexity assumption
known as the Unique Games Conjecture [Khot 2002] it is also hard to $c_0$-approximate
$p(k, A_S)$ in the singleton rules setting for some constant $c_0$. However, our hardness
results do not rule out the possibility of a $c$-approximation for a larger constant $c$.

3.1. Positive Rules: Efficient Algorithm for Constant k 

We first show that $p(k, A_S)$ can be optimized efficiently for constant values of $k$ in the
positive rules setting. In this section the organization is given positive rules $R_1, \ldots, R_m$
as well as preference lists $\ell_1, \ldots, \ell_n$. We assume that the organization can efficiently
query the preference lists (e.g., given $S \subseteq [m]$ the organization can efficiently find
$\ell_i(A_S)$, user $i$'s preferred password given policy $A_S$).

We elaborate on the key algorithmic ideas listed above. First, we can efficiently reduce
each preference list $\ell_i$ to a list of at most $m$ passwords (Claim 3.1). While the
reduced list $\hat{\ell}_i$ is much shorter than $\ell_i$ it is still sufficient to determine user $i$'s preferred
password given policy $A_S$ for any $S \subseteq [m]$. We use $\hat{\mathcal{P}}$ to denote the reduced space
of potential passwords.



Algorithm 1 Reduce
Input:

Preference List: $\ell$
Positive Rules: $R_1, \ldots, R_m$

Initialize: $i \leftarrow 0$, $S_0 \leftarrow [m]$, $\hat{\ell} \leftarrow$ empty ranking
while $S_i \neq \emptyset$ do
    Let $w$ be $\ell(A_{S_i})$
    $\hat{\ell} \leftarrow (\hat{\ell}, w)$ ▷ Append the current most preferred password to $\hat{\ell}$
    $S_{i+1} \leftarrow S_i \setminus \{j \mid w \in R_j\}$ ▷ Deactivate all rules that contain $w$
    $i \leftarrow i + 1$
return $\hat{\ell}$



Claim 3.1. Algorithm 1 makes at most $m$ queries to $\ell$ and $m^2$ membership queries
and outputs a reduced preference list $\hat{\ell}$ over at most $m$ passwords such that for every
$S \subseteq [m]$ it holds that $\hat{\ell}(A_S) = \ell(A_S)$.
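Algorithm 1 translates almost line by line into code. The sketch below assumes a ranking is a Python list (most preferred first) and each positive rule is a set of passwords; the helper `pick` plays the role of a query to the preference list, and all names and the toy data are hypothetical.

```python
def pick(ranking, allowed):
    """The user's most preferred allowed password, or None if there is none."""
    return next((w for w in ranking if w in allowed), None)


def reduce_ranking(ranking, rules):
    """Reduce a preference list to at most len(rules) passwords (Algorithm 1)."""
    active = set(range(len(rules)))
    reduced = []
    while active:
        allowed = set().union(*(rules[j] for j in active))
        w = pick(ranking, allowed)  # one query to the preference list
        if w is None:               # only possible if the ranking is incomplete
            break
        reduced.append(w)
        # Deactivate every rule that contains w; at least one rule is removed.
        active = {j for j in active if w not in rules[j]}
    return reduced


# Hypothetical ranking over five passwords and three positive rules.
ranking = ["a", "b", "c", "d", "e"]
rules = [{"a", "b"}, {"b", "c"}, {"d"}]
reduced = reduce_ranking(ranking, rules)
```

On this toy input the reduced list keeps three of the five passwords, and for every set of active rules the reduced list and the full list select the same password, as Claim 3.1 states.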
Proof. Clearly, the algorithm's main loop iterates at most $m$ times because for
each $i$ we eliminate at least one rule (i.e., $|S_{i+1}| < |S_i|$), so the bound on queries and
the length of $\hat{\ell}$ are immediate. (Because we assume that we can query $\ell$ efficiently,
Algorithm 1 is also efficient.) By construction we have $\hat{\ell}(A_{S_i}) = \ell(A_{S_i})$ for each $S_i$. Fix any
$S \subseteq [m]$. Let $S_i$ be such that $S \subseteq S_i$ yet $S \not\subseteq S_{i+1}$, and let $w_i$ be the most preferred word
in $\ell$ out of all words in $\bigcup_{j \in S_i} R_j$. If it is the case that $w_i \in \bigcup_{j \in S} R_j$, then $w_i$ is the most
preferred word in $A_S$ too and we're done. Otherwise, $w_i \in \bigcup_{j \in S_i \setminus S} R_j$ and no rule in $S$
contains $w_i$, which means that
removing the set $\{j \in S_i : w_i \in R_j\}$ creates a set $S_{i+1}$ s.t. $S \subseteq S_{i+1}$, contradiction. □

Second, the "guess and check" idea means that our algorithm starts by guessing
what the optimal solution looks like (e.g., what the $k$ most popular passwords will be
in the optimal solution and what the probability of the $k$'th most popular password is).
There are at most $(mn)^{k+1}$ potential solutions to brute-force try. As we show, for each
solution, it is easy to figure out which sets must be eliminated.



Algorithm 2 GuessAndCheck
Input:

Preference Lists: $\ell_1, \ldots, \ell_n$
Positive Rules: $R_1, \ldots, R_m \subseteq \mathcal{P}$
Integer $k$

Initialize: Candidates $\leftarrow \emptyset$ ▷ Candidate Solutions
for $i = 1 \to n$ do
    $\hat{\ell}_i \leftarrow$ Reduce$(\ell_i, R_1, \ldots, R_m)$
$\hat{\mathcal{P}} \leftarrow \bigcup_{i=1}^{n} \hat{\ell}_i$ ▷ Reduced Password Space
...
    $S_{G,p} \leftarrow S_{G,p} \setminus \{j \mid w \in R_j\}$ ▷ Ban $w$ because it is inconsistent with guess
    if $\Pr[w \mid A_{S_{G,p}}] \le p$ for all $w \in (A_{S_{G,p}} \setminus G)$ then
...



Theorem 3.2. Algorithm 2 runs in time polynomial in $n^k$, $m^k$ and outputs a set of
positive rules $S \subseteq [m]$ such that

$p(k, A_S) \le p(k, A_{S'})$

for every other set $S' \subseteq [m]$.

Proof. It is evident that the running time of the algorithm is poly$(n^k, m^k)$ since we
only have $O((nm)^{k+1})$ potential solutions to try.

Let $A_{S^*}$ denote an optimal solution and let $G^*$ denote the $k$ most popular passwords
in this solution. Suppose we start with the correct guess ($G = G^*$ and $p$ is the probability
of the $k$'th most popular password), then we claim that our algorithm must produce
the optimal solution. In particular, we maintain the invariant that $A_{S^*} \subseteq A_{S_{G,p}}$ until
we converge to the optimal solution. Clearly, this is true initially, before we have
eliminated any passwords.

Suppose that the invariant holds and that our algorithm bans a password $w \in \mathcal{P} \setminus G$
by deactivating all rules in $S_{G,p}$ that contain $w$. Then by the definition of our algorithm
we must have $\Pr[w \mid A_{S_{G,p}}] > p$. If $w \in A_{S^*}$ then by Equation 1 we have

$\Pr[w \mid A_{S^*}] \ge \Pr[w \mid A_{S_{G,p}}] > p$,

which contradicts the choice of $G$. Therefore $w \notin A_{S^*}$, so all rules that contain it are
deactivated in $S^*$ and the invariant still holds. By definition Algorithm 2 terminates
when every password $w \in A_{S_{G,p}} \setminus G$ has probability at most $p$. Because our invariant
still holds we can apply Equation 1 again to get

$\Pr[G \mid A_{S_{G,p}}] \le \Pr[G \mid A_{S^*}] = p(k, A_{S^*})$.

Hence, $A_{S_{G,p}}$ is an optimal solution. □
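The guess-and-check procedure can be sketched as follows. Part of the pseudocode did not survive in the text, so the inner loop here is a reconstruction from the proof (ban any password outside the guess $G$ whose probability exceeds $p$, and stop once none remains); it should be read as one plausible reading, not the authors' exact algorithm. All function names and the toy instance are hypothetical, rankings are assumed to be full permutations, and rules are assumed non-empty.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def allowed_set(rules, active):
    """A_S: union of the active positive rules."""
    return set().union(*(rules[j] for j in active))


def distribution(rankings, allowed):
    """Pr[w | A] when every user picks their top-ranked allowed password."""
    picks = Counter(next(w for w in r if w in allowed) for r in rankings)
    return {w: Fraction(c, len(rankings)) for w, c in picks.items()}


def p_k(rankings, allowed, k):
    """Total probability of the k most popular passwords under the policy."""
    probs = sorted(distribution(rankings, allowed).values(), reverse=True)
    return sum(probs[:k], Fraction(0))


def reduce_ranking(ranking, rules):
    """Algorithm 1: keep only passwords the user could ever end up picking."""
    active, reduced = set(range(len(rules))), []
    while active:
        allowed = allowed_set(rules, active)
        w = next(x for x in ranking if x in allowed)
        reduced.append(w)
        active = {j for j in active if w not in rules[j]}
    return reduced


def guess_and_check(rankings, rules, k):
    """Try every guess (G, p); return (best value, best set of active rules)."""
    n, m = len(rankings), len(rules)
    space = set()
    for r in rankings:
        space.update(reduce_ranking(r, rules))  # reduced password space
    best = None
    for guess in combinations(sorted(space), k):
        for count in range(n + 1):
            p = Fraction(count, n)
            active = set(range(m))
            while active:
                dist = distribution(rankings, allowed_set(rules, active))
                bad = [w for w, q in dist.items() if w not in guess and q > p]
                if not bad:
                    break
                # Ban one inconsistent password by deactivating its rules.
                active = {j for j in active if bad[0] not in rules[j]}
            if active:
                value = p_k(rankings, allowed_set(rules, active), k)
                if best is None or value < best[0]:
                    best = (value, frozenset(active))
    return best


# Hypothetical instance: six users, five passwords, three positive rules.
rankings = [list("abcde"), list("acbed"), list("adebc"),
            list("bacde"), list("cedab"), list("edcba")]
rules = [{"a", "b"}, {"b", "c"}, {"c", "d", "e"}]
best_value, best_rules = guess_and_check(rankings, rules, 2)
```

Every candidate is scored by its true $p(k, A_S)$, so the returned value can never be better than the optimum, and by Theorem 3.2 the correct guess yields a candidate that attains it. The enumeration is exponential in $k$, which is why the result is stated for constant $k$.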

For the special case $k = 1$ the simple algorithm IterativeElimination (Algorithm 3)
suffices. The basic idea is very simple: iteratively eliminate the most popular password 
w by deactivating all positive rules that contain w. We repeat this process until no 
passwords remain. We claim that one of the solutions along the way was the optimal 
solution. 



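Algorithm 3 IterativeElimination
Input:

Preference Lists: $\ell_1, \ldots, \ell_n$
Positive Rules: $R_1, \ldots, R_m \subseteq \mathcal{P}$

Initialize: $S_0 \leftarrow [m]$, $i \leftarrow 0$
while $S_i \neq \emptyset$ do
    $w(S_i) \leftarrow \arg\max\{\Pr[w \mid A_{S_i}] \mid w \in A_{S_i}\}$ ▷ $w(S_i)$ is most popular allowed pwd
    $S_{i+1} \leftarrow S_i \setminus \{j \mid w(S_i) \in R_j\}$ ▷ Deactivate all rules that contain $w(S_i)$
    $i \leftarrow i + 1$
return $S_{i^*}$ where $i^* \leftarrow \arg\min_i p(1, A_{S_i})$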
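Algorithm 3 can be sketched in a few lines. The version below assumes full preference lists given as Python lists and positive rules given as non-empty sets; the function names and the six-user toy instance are hypothetical and only meant to show the loop in action.

```python
from collections import Counter


def allowed_set(rules, active):
    """A_S: union of the active positive rules."""
    return set().union(*(rules[j] for j in active))


def most_popular(rankings, allowed):
    """Most popular allowed password and its probability under the policy."""
    picks = Counter(next(w for w in r if w in allowed) for r in rankings)
    w, count = picks.most_common(1)[0]
    return w, count / len(rankings)


def iterative_elimination(rankings, rules):
    """Return the visited rule set minimizing p(1, A_S), with its value."""
    active = set(range(len(rules)))
    best_prob, best_active = None, None
    while active:
        w, prob = most_popular(rankings, allowed_set(rules, active))
        if best_prob is None or prob < best_prob:
            best_prob, best_active = prob, set(active)
        # Deactivate all rules that contain the most popular password.
        active = {j for j in active if w not in rules[j]}
    return best_active, best_prob


# Hypothetical instance: six users, five passwords, three positive rules.
rankings = [list("abcde"), list("acbed"), list("adebc"),
            list("bacde"), list("cedab"), list("edcba")]
rules = [{"a", "b"}, {"b", "c"}, {"c", "d", "e"}]
chosen_rules, chosen_prob = iterative_elimination(rankings, rules)
```

On this instance the loop first sees password 'a' with probability 1/2, drops the rule containing it, and then reaches a policy whose most popular password has probability 1/3, which is also the best value over all seven non-empty rule sets.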



Theorem 3.3. Algorithm 3 outputs a set of positive rules $S \subseteq [m]$ such that

$\forall S' \subseteq [m],\quad p(1, A_S) \le p(1, A_{S'})$.

Proof. Let $T$ denote the optimal policy. Clearly if $T = [m]$ then our algorithm
returns $S^* = T$ because that is the first set we try. Otherwise, $T \subsetneq [m]$. Let $S$ be the
last set our algorithm considers that has the property that $T \subseteq S$. Again, if $T = S$,
our algorithm returns $S$. Let $w(T)$ be the most popular word in $A_T$, and because of
optimality $\Pr[w(T) \mid A_T] \le \Pr[w(S) \mid A_S]$.

Now, because we modify $S$ to not contain $T$ in the next iteration, then the most
popular word in $S$, $w(S)$, has to belong to some rule $R_j$ where $j \in T$. Therefore
$w(S) \in \bigcup_{j \in T} R_j$. By the definition, the most popular word in $A_T$ satisfies
$\Pr[w(T) \mid A_T] \ge \Pr[w(S) \mid A_T]$.

But observe, because $w(S) \in \bigcup_{j \in T} R_j$, we must have that $w(S)$ is at least as popular
in $T$. Indeed, if $\ell$ is a preference list where we disallowed $\mathcal{P} \setminus \bigcup_{j \in S} R_j$ and the most
preferred word is $w(S)$, then as long as we disallow more words but keep allowing $w(S)$
the word $w(S)$ remains at the top of the list. Therefore, $\Pr[w(S) \mid A_T] \ge \Pr[w(S) \mid A_S]$.
Combining together all inequalities we get $\Pr[w(T) \mid A_T] = \Pr[w(S) \mid A_S]$, which means
that $S$ is also optimal, so the set $S^*$ returned by our algorithm is at least as good as $S$. □
3.2. Singleton Rules: Hardness for Large k

Now we turn our attention to the problem of optimizing $p(k, A_S)$ for large values of
$k$. Theorem 3.4 says that unless P = NP no polynomial time algorithm can compute
the optimal $p(k, A_S)$ even with singleton rules. If we are willing to make the Unique Games
Conjecture (UGC) [Khot 2002] then it is hard to even $c_0$-approximate $p(k, A_S)$ for some
constant $c_0$. These results immediately imply hardness in both the positive and negative
rules setting because these settings are a generalization of the singleton rules
setting.

Theorem 3.4. Unless P = NP there is no poly$(k, n, N)$-time algorithm that gets as input
an arbitrary set of $n$ preference-lists $\ell_1, \ldots, \ell_n$ over $\mathcal{P}$ and an integer $k$, and outputs
the optimal $p(k, A)$ in the singleton rules setting.

Proof. We prove the theorem using a reduction from the Vertex-Cover problem. 
Given a graph G over g vertices and e edges and an integer t, we first define 

r = {wu : ueV{G)}U{w^^, : {u,v)eEiG)} 

and observe that \P\ = g + e. We also construct the following n — 2e preference-lists, 
where for every edge (u, v) e E{G) we have the two lists: 

: ^u,v 1 • • • 
: ^u,v 7 • • • 

where the choice of passwords below position 2 is arbitrary, but both rankings must be 
identical from position 2 onwards. Finally, we set k~g + e — t — I. 

Given a policy c P, we denote all banned words as B = V \ A. We denote by 
Lb as the set of words that at least one user ranks first after banning all words in B. 
Observe, i0 ~ {wu ■ u e V{G)}. Using this notation, we show this reduction indeed 
proves A^P-hardness. 

First, suppose G has a vertex cover C of size < t. Then by banning all passwords 
B = {wy : V e C} we now have Lb ^ V \ B, because for every (u, v) e E{G) either 
Wu or Wy are banned, so the word Wu,y appears at the top of at least one of the two 
lists {tu,v,£v,u}- Therefore, the n preference-lists induce a distribution whose support 
contains g + e — \B\> g + e — t words, thus p{g + e — t - \,A) < I. 

Conversely, suppose all vertex covers of $G$ are of size at least $t + 1$. Let $B$ be any set of
banned words. Clearly, if $|B| \ge t + 1$ then the distribution induced by the $n$ preference-lists
has support of size at most $g + e - t - 1$, which means that $p(g + e - t - 1, \mathcal{P} \setminus B) = 1$.
Otherwise, $|B| \le t$, and we denote the set of vertices $C = \{v : w_v \in B\}$. Observe that,
since any vertex cover of $G$ must contain $\ge t + 1$ vertices, there have to be at least
$t + 1 - |C|$ edges that $C$ does not cover (since we can always complete $C$ to a vertex
cover by adding one vertex from each uncovered edge). Therefore, there have to be at
least $t + 1 - |C|$ words that do not appear at the top of any preference list. We conclude
that the distribution induced by the $n$ preference-lists has a support of size at most

$$|L_B| \le g - |C| + e - (t + 1 - |C|) \le g + e - t - 1$$

thus $p(g + e - t - 1, \mathcal{A}) = 1$. □
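The reduction can be sanity-checked on small graphs. The sketch below is only an illustration under stated assumptions: the function name `support_after_ban` and the tuple encoding of words are hypothetical, and it bans vertex words only, so the arbitrary tail of each preference-list is never consulted. It reports whether the support of the induced distribution has more than $k = g + e - t - 1$ words, which is equivalent to $p(k, \mathcal{A}) < 1$.

```python
def support_after_ban(edges, banned_vertices):
    """Words ranked first by at least one list once the given vertex words are banned.

    Each edge (u, v) contributes two lists whose first two entries are
    (w_u, w_{u,v}) and (w_v, w_{u,v}). Only vertex words are banned in this
    sketch, so a list's top choice is always one of its first two entries.
    """
    support = set()
    for u, v in edges:
        edge_word = ("edge", frozenset((u, v)))
        for first in (u, v):  # the lists l_{u,v} and l_{v,u}
            if first in banned_vertices:
                support.add(edge_word)
            else:
                support.add(("vertex", first))
    return support


# Path graph 1 - 2 - 3: g = 3 vertices, e = 2 edges, and {2} is a vertex cover (t = 1).
edges = [(1, 2), (2, 3)]
g, e, t = 3, 2, 1
k = g + e - t - 1

covered = support_after_ban(edges, {2})    # ban the words of a vertex cover
uncovered = support_after_ban(edges, {1})  # ban a same-size set that is not a cover
print(len(covered) > k, len(uncovered) > k)
```

Banning the cover leaves a support of $g + e - t$ words, while the non-cover leaves one edge word unreachable, matching the two directions of the proof.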

From the same reduction described in Theorem 3.4 we get UGC-hardness of approximation.
While there are sub-exponential time algorithms to solve the Unique Games
problem [Arora et al. 2010], there are no known polynomial time algorithms. Many
famous approximation hardness results are based on the Unique Games Conjecture
(e.g., $2 - \epsilon$ hardness for vertex cover [Khot and Regev 2008]). Our reduction relies on a
result in [Austrin et al. 2011], which says that vertex cover is hard to approximate up
to a (say) 1.5-factor even on bounded degree graphs. Because we start with a bounded
degree graph we can argue that each password in our reduction appears at the top of
at most $d$ preference-lists for some constant $d$. See the appendix for a formal proof.

Theorem 3.5. There exists a constant $c > 1$ such that it is UGC-hard for a
poly$(n, N, k)$-time algorithm to $c$-approximate the optimal $p(k, \mathcal{A})$ in the singleton rules
setting and the rankings model.

3.3. Negative Rules: Hardness of Approximation for k = 1

We next turn to negative rules, where we show that the problem is extremely difficult
even for $k = 1$. The proof appears in the appendix; it is quite interesting and
we encourage the reader to take a look.

Theorem 3.6. Let $\epsilon > 0$. Unless P = NP there is no polynomial time algorithm (in
$N, n, m$) that approximates $\min_{S \subseteq [m]} p(1, \mathcal{A}_S)$ to a factor of $n^{1/3 - \epsilon}$ in the negative rules
setting and the rankings model.

4. NORMALIZATION MODEL: COMPLEXITY RESULTS 

In this section we focus on complexity results for the normalization model. Here the
structure of the input to our problem is a bit different: for each password $w \in \mathcal{P}$ we are
given the probability $\Pr[w]$ that $w$ is selected by a random user when $\mathcal{A} = \mathcal{P}$. Note that
now we can give the distribution explicitly because it requires $N$ numbers (whereas
a distribution over rankings requires $N!$ numbers). This distribution induces a distribution
over $\mathcal{P}$ for any password composition policy $\mathcal{A}$ by normalizing probabilities, as
explained in Section 2.
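As a concrete reading of the normalization model, the following sketch renormalizes an explicit distribution over the allowed passwords and sums the $k$ largest induced probabilities. The function names and the three-password toy distribution are illustrative assumptions, not part of the paper.

```python
def induced_distribution(pr, allowed):
    """Renormalize the initial distribution `pr` over the allowed passwords."""
    total = sum(pr[w] for w in allowed)
    return {w: pr[w] / total for w in allowed}


def p(k, pr, allowed):
    """Probability mass of the k most likely passwords under the policy `allowed`."""
    dist = induced_distribution(pr, allowed)
    return sum(sorted(dist.values(), reverse=True)[:k])


# Toy distribution (made up for illustration only).
pr = {"123456": 0.5, "password": 0.3, "iloveyou": 0.2}

no_policy = p(1, pr, set(pr))                       # nothing banned
ban_top = p(1, pr, {"password", "iloveyou"})        # ban the most likely password
print(no_policy, ban_top)
```

In this toy case banning the most popular password makes the next one more likely (0.3 becomes 0.3/0.5), which is why banning is not always an improvement.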

Because the normalization model is a special case of the ranking model our algo- 
rithms for the ranking model can also be applied in the normalization model. The 
question is whether or not the hardness results carry over. 

We first consider the singleton rules setting with large $k$, and show that we
can compute $\arg\min_{\mathcal{A} \subseteq \mathcal{P}} p(k, \mathcal{A})$ in polynomial time in $N$ (Theorem 4.1). This result
separates the normalization model from the ranking model (e.g., compare Theorems
4.1 and 3.4). However, it does not extend to the positive rules setting. In fact, we show
that optimizing $p(k, \mathcal{A}_S)$ is NP-Hard when $k$ is a parameter (Theorem 4.4).

With negative rules $R_1, \ldots, R_m$ we show that it is hard to $c_0$-approximate
$\min_{S \subseteq [m]} p(1, \mathcal{A}_S)$ (Theorem 4.2). However, we cannot rule out the possibility of
an efficient $c$-approximation algorithm for some constant $c$ in the normalization model
(recall that Theorem 3.6 ruled out the possibility of a $c$-approximation algorithm in the
ranking model for any $c$).

4.1. Singleton Rules: Efficient Algorithm for Large k

We present SortAndOptimize, an efficient algorithm to optimize $p(k, \mathcal{A})$ in the singleton
rules setting for any value of $k$. The key intuition behind our algorithm is that
if $w_1 \in \mathcal{P}$ is the most likely password then $w_1$ will remain the most likely allowed password
unless we ban it, a property that does not hold in the rankings model. A formal
proof of Theorem 4.1 can be found in the appendix.

Theorem 4.1. For every $k$, Algorithm 4 computes $\arg\min_{\mathcal{A}} p(k, \mathcal{A})$ in the singleton
rules setting of the normalized probabilities model, in time $O(N \log(N))$.

4.2. Negative Rules: Hardness for k = 1

We next prove an inapproximability result that is somewhat weaker than the one that 
we obtained for the more general ranking model. 
Algorithm 4 SortAndOptimize

Input:
    Password space $\mathcal{P}$ and a probability distribution over $\mathcal{P}$.
    Integer $k$.

Sort the words in $\mathcal{P}$ from highest to lowest probability, $w_1, w_2, \ldots, w_N$.
return the set $\mathcal{A}_i = \{w_j : j \ge i\}$, where $i$ minimizes the ratio

$$\frac{\sum_{j=i}^{i+k-1} \Pr[w_j]}{\sum_{j=i}^{N} \Pr[w_j]}$$
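A minimal Python sketch of this procedure follows. It assumes the ratio being minimized is $p(k, \mathcal{A}_i)$ in the normalization model, i.e. the mass of the $k$ most likely words of the suffix $\{w_j : j \ge i\}$ divided by the total mass of that suffix; the function name, the use of prefix sums, and the toy distribution are illustrative choices rather than the paper's implementation.

```python
from fractions import Fraction
from itertools import accumulate


def sort_and_optimize(pr, k):
    """Return (allowed set, ratio) for the best suffix of the sorted password list."""
    words = sorted(pr, key=pr.get, reverse=True)           # highest probability first
    prefix = list(accumulate((pr[w] for w in words), initial=0))
    n = len(words)
    best_i, best_ratio = None, None
    for i in range(n):                                     # ban the i most likely words
        remaining = prefix[n] - prefix[i]
        if remaining <= 0:                                 # only zero-probability words left
            break
        ratio = (prefix[min(i + k, n)] - prefix[i]) / remaining
        if best_ratio is None or ratio < best_ratio:
            best_i, best_ratio = i, ratio
    return set(words[best_i:]), best_ratio


# Toy distribution with exact arithmetic (values are made up for illustration).
pr = {
    "123456": Fraction(5, 10),
    "password": Fraction(2, 10),
    "qwerty": Fraction(1, 10),
    "dragon": Fraction(1, 10),
    "monkey": Fraction(1, 10),
}
allowed_k1, ratio_k1 = sort_and_optimize(pr, 1)
allowed_k2, ratio_k2 = sort_and_optimize(pr, 2)
print(sorted(allowed_k1), ratio_k1)
print(sorted(allowed_k2), ratio_k2)
```

Sorting dominates the cost; the scan over suffixes is linear because each ratio is read off the prefix sums in constant time.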



Theorem 4.2. There exists some constant $c_0 > 1$ such that unless NP = BPP
no polynomial time algorithm (in $n, N, m$) can $c_0$-approximate $\min_{S \subseteq [m]} p(1, \mathcal{A}_S)$ in the
negative rules setting and the normalization model.

We will require the following construction; the proof is given in the appendix. 

Lemma 4.3. Fix $m$ and $s$ such that $m > s$. There exists a domain $D$ of size
$\Theta(s^2 \log(m))$ and a family of $m$ sets, $F_1, F_2, \ldots, F_m \subset D$, such that each set in the family
contains $|D|/(2s)$ elements, and for every $C \subset [m]$ of size $|C| \le s$, we have that the size of
the union satisfies $\left|\bigcup_{i \in C} F_i\right| \ge \frac{|C|}{8s}|D|$. This domain can be constructed in randomized
poly$(s, m)$ time.

That is, each set in this family contains exactly the same fraction of the domain,
and furthermore, any union of $|C| \le s$ sets has the property that its cardinality is
proportional to $|C| \cdot |D|/s$.
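The randomized construction behind this lemma (given in the appendix) takes $d$ independent random maps $\phi_j : [m] \to [2s]$ and sets $F_i = \{(\phi_j(i), j)\}$. The sketch below follows that recipe with 0-based indices; the function names, the seed, and the small parameter values are assumptions made for illustration, and the code only demonstrates the set sizes, not the high-probability union bound.

```python
import random


def build_family(m, s, d, rng):
    """F_i = {(phi_j(i), j) : j < d} for d random maps phi_j from [m] to [2s]."""
    phis = [[rng.randrange(2 * s) for _ in range(m)] for _ in range(d)]
    return [{(phis[j][i], j) for j in range(d)} for i in range(m)]


def union_size(family, subset):
    """Number of domain elements covered by the sets indexed by `subset`."""
    return len(set().union(*(family[i] for i in subset)))


m, s, d = 20, 5, 40                      # toy parameters; the lemma uses d = Theta(s log m)
family = build_family(m, s, d, random.Random(0))
domain_size = 2 * s * d                  # D = [2s] x [d]

print(all(len(F) == domain_size // (2 * s) for F in family))
print(union_size(family, [0, 1, 2]), "of at most", 3 * d)
```

Every $F_i$ has exactly one element per map, so $|F_i| = d = |D|/(2s)$ holds deterministically; only the lower bound on unions is probabilistic.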

Proof of Theorem 4.2. We reduce from Set-Cover, one of the classic NP-Complete
problems [Karp 1972]. We are given sets $S_1, \ldots, S_m \subseteq U$, universe $U =
\{1, \ldots, g\}$, and an integer $t < m$, and we are asked whether there is a set $C \subseteq [m]$ of
size $\le t$ such that $U = \bigcup_{i \in C} S_i$.

It is a known fact that there exist Set-Cover instances, with $(g, m, t)$ all polynomially
dependent on each other, that are hard to approximate to a factor of $c \ln n$ [Alon et al.
2006]. That is, on this particular family of instances, it is NP-hard to distinguish
whether there exists a cover of size $t$ or all covers have size $(1 - \epsilon)c \cdot t \ln n$.

We now describe the reduction. Given a $(g, m, t)$-Set Cover instance, we set $s =
c \cdot t \ln g = \Theta(t \ln t)$ and construct a domain $D$ and $m$ sets $F_1, F_2, \ldots, F_m \subset D$ as in
Lemma 4.3. We then create the following password-banning instance. First, $\mathcal{P}$ is the
union of $D$ with $g$ additional disjoint words denoted $w_1, \ldots, w_g$. Now, for each set $S_i$ in
the Set-Cover we add a rule $R_i = \{w_j\}_{j \in S_i} \cup F_i$. Finally, we set the words'
probabilities as follows. Fixing some arbitrarily small $\delta > 0$, we set for every $i$ the
probability $\Pr[w_i] = (1 - \delta)/g$ and for every $x \in D$ we set the probability $\Pr[x] = \delta/|D|$.

Without loss of generality we can assume that $|D| > 100g$ (because, for example,
we can take $100g$ copies of the original $D$). Therefore, any policy that bans all
of $\{w_1, w_2, \ldots, w_g\}$ yet leaves a constant (say $\ge 1/10$) fraction of $D$ has $p_1 \le 10/|D|$,
whereas any policy that keeps even one of the words in $\{w_1, w_2, \ldots, w_g\}$ has $p_1 \ge 1/(2g)$.
Therefore, if the Set-Cover instance has a cover of size $\le s = \Theta(t \ln g)$, then a $c_0$-approximation
of the optimal banning-policy must find a cover for $\{w_1, w_2, \ldots, w_g\}$. We
will assume from now on that our Set-Cover instance is such that it has a cover of size
$\le s$. (Indeed, if $s > t \log(t)$ then the instance is no longer NP-hard, since the greedy
algorithm must return a cover of size $> t \log(t)$ which causes us to deduce that the
optimal cover must have size $> t$.)
So now, suppose our Set-Cover instance has a cover of size $t$. Then the respective
union of rules bans every password in $\{w_1, w_2, \ldots, w_g\}$ and no more than $\frac{t}{2s}|D|$ words
of $D$ (we get an upper bound by multiplying the size of each set by the number of sets).

This leaves a collection of $\left(1 - \frac{t}{2s}\right)|D|$ equally likely words, so
$p_1 = \left(1 - \frac{t}{2s}\right)^{-1}|D|^{-1} = (1 - O(1/\log(g)))^{-1}|D|^{-1} = (1 + o(1))|D|^{-1}$.
In contrast, if all covers of our Set-Cover
instance have size $s' \ge c \cdot t \ln(g)$ (where, because we assume some cover has size $\le s$, we
have $s' \le s$), then any collection of rules that bans all words in $\{w_1, w_2, \ldots, w_g\}$ must
also ban at least $\frac{s'}{8s}|D|$ words out of $D$. This leaves at most $(1 - \Omega(1))|D|$ words in $D$
and so $p_1 \ge (1 - \Omega(1))^{-1}|D|^{-1}$. Denoting the latter constant as $c_0$, we have that any
$(c_0 - \epsilon)$-approximation of the optimal banning-policy indicates the existence of a cover of
cardinality $< c \cdot t \ln(g)$. □

4.3. Positive Rules: Hardness of Approximation for Large k 

While we can show that it is possible to optimize $p(k, \mathcal{A})$ in the singleton rules setting,
our result does not extend to the more general positive rules setting. We are able to
show that it is NP-Hard to compute $\arg\min_{S \subseteq [m]} p(k, \mathcal{A}_S)$. However, our reduction does
not imply approximation hardness so we cannot rule out the existence of a PTAS.

Theorem 4.4. Unless P = NP there is no polynomial time algorithm (in $N, m, n$)
which outputs $\arg\min_{S \subseteq [m]} p(k, \mathcal{A}_S)$ in the positive rules setting and the normalization
model.

The theorem's proof is relegated to the appendix.

5. EFFICIENT SAMPLING ALGORITHMS

In a sense, our complexity results are not "realistic", and in particular in the ranking
model our positive algorithmic results assume access to each user's full preferences.
Moreover, some algorithms are allowed to run in polynomial time in the number of
passwords $N$, which can be huge. In this section we use our complexity results as
guidelines in the design of practical sampling algorithms.

In more detail, we are given oracle access to rules $R_1, \ldots, R_m$ (e.g., we can ask whether
or not a password $w \in R_i$) and we are allowed to sample from the distribution induced
by the password composition policy $\mathcal{A}_S$ for any $S \subseteq [m]$. Less formally, a sample is
equivalent to asking a random user what her favorite password is given the current
policy.
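The two kinds of access can be mimicked in a few lines for the ranking model with positive rules. In this sketch each rule is a membership predicate (the oracle), a policy $\mathcal{A}_S$ allows a password if some active rule accepts it, and a sample is the highest-ranked allowed password of a random user. The function names, the example rules, and the single preference list are hypothetical.

```python
import random


def is_allowed(w, rules, active):
    """Positive rules: w is allowed under A_S if some active rule accepts it."""
    return any(rules[j](w) for j in active)


def sample_password(preference_lists, rules, active, rng):
    """Ask a random user for her favorite password under the policy A_S."""
    ranking = rng.choice(preference_lists)
    for w in ranking:                      # walk down her list until something is allowed
        if is_allowed(w, rules, active):
            return w
    return None                            # nothing in her (truncated) list is allowed


# Hypothetical rules and one hypothetical user.
rules = [
    lambda w: len(w) >= 8,                 # rule 0: at least 8 characters
    lambda w: " " in w,                    # rule 1: contains a space (passphrase)
]
users = [["123456", "Password1", "correct horse"]]
rng = random.Random(0)

first = sample_password(users, rules, {0}, rng)
second = sample_password(users, rules, {1}, rng)
print(first, second)
```

The sampler never needs the full password space, only the oracle and the users' answers, which is the access model the sampling algorithms rely on.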

We will work in the more general ranking model, so there is essentially only one
positive result we can build on: Theorem 3.2, a polynomial time algorithm for constant
$k$ in the positive rules setting. When adapting this algorithm to the sampling setting,
we cannot expect it to work perfectly due to the inherent uncertainty of this domain.
Instead we expect the algorithm to find an $\epsilon$-optimal password composition policy with
probability at least $1 - \delta$, for any given $\epsilon$ and $\delta$. Crucially, the number of samples must
not depend on the number of passwords $N$, and must have a polynomial dependence
on the other parameters.

Formally, we let $S^* \subseteq [m]$ denote the optimal collection of positive rules to activate
(for all $S \subseteq [m]$, $p(1, \mathcal{A}_{S^*}) \le p(1, \mathcal{A}_S)$). Our goal is to find a $(1, \epsilon)$-approximation $S \subseteq [m]$
to $p(1, \mathcal{A}_{S^*})$, that is, $S$ such that $p(1, \mathcal{A}_S) \le p(1, \mathcal{A}_{S^*}) + \epsilon$, with probability $1 - \delta$.

We first present Algorithm 5, which achieves our goal for $k = 1$; this algorithm is an
adaptation of Algorithm 3.

Theorem 5.1. Algorithm 5 runs in polynomial time in $m, 1/\epsilon, 1/\delta$, requires
$O(m \log(m/\delta)/\epsilon^2)$ samples and returns a $(1, \epsilon)$-approximation $S \subseteq \{1, \ldots, m\}$ of
$p(1, \mathcal{A}_{S^*})$ with probability at least $1 - \delta$.
Algorithm 5 SampleAndEliminate

Positive Rules: $R_1, \ldots, R_m$
Input: $\epsilon, \delta$

Initialize: $S_0 \leftarrow [m]$, $i \leftarrow 0$
while $S_i \ne \emptyset$ do
    Sample: Draw samples $w_1, \ldots, w_s$ according to the distribution $\Pr[w \mid \mathcal{A}_{S_i}]$
    $W \leftarrow \{w_1, \ldots, w_s\}$
    $s_w \leftarrow |\{j \mid w_j = w\}|$ for each $w \in W$.
    $w^* \leftarrow \arg\max\{s_w \mid w \in W\}$    ▷ $w^*$ is the most frequently sampled password
    $p_i \leftarrow s_{w^*}/s$    ▷ $p_i$ is our estimation of $\Pr[w^* \mid \mathcal{A}_{S_i}]$
    if $p_i < \epsilon/2$ then return $S_i$    ▷ The current solution is already sufficiently good
    else
        $S_{i+1} \leftarrow S_i - \{j \mid w^* \in R_j\}$    ▷ Deactivate all rules that contain $w^*$
        $i \leftarrow i + 1$
return $S_{i^*}$ where $i^* = \arg\min\{p_i \mid i < m\}$.
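The loop can be sketched in Python as follows. To keep the example self-contained, sampling is simulated in the normalization model from an explicit toy distribution (an assumption; the algorithm itself only needs sample access), rules are non-empty sets of passwords, and the final choice is the iteration with the smallest estimate, in line with the analysis that follows ($p_{i^*} \le p_i$). All names and numbers are illustrative.

```python
import random
from collections import Counter


def sample_and_eliminate(rules, pr, eps, s, rng):
    """Return a set of active positive rules whose top password is (estimated) unlikely."""
    active = set(range(len(rules)))
    history = []                                   # pairs (estimate p_i, rule set S_i)
    while active:
        allowed = sorted(set().union(*(rules[j] for j in active)))
        weights = [pr[w] for w in allowed]         # relative weights; choices() normalizes
        samples = rng.choices(allowed, weights=weights, k=s)
        w_star, count = Counter(samples).most_common(1)[0]
        p_i = count / s                            # estimate of Pr[w* | A_{S_i}]
        if p_i < eps / 2:
            return set(active)                     # already sufficiently good
        history.append((p_i, set(active)))
        active = {j for j in active if w_star not in rules[j]}  # drop rules containing w*
    if not history:
        return set()
    return min(history, key=lambda entry: entry[0])[1]


# Toy instance (made up): three overlapping positive rules over four passwords.
pr = {"123456": 0.6, "password": 0.3, "Tr0ub4dor&3": 0.05, "correcthorse": 0.05}
rules = [
    {"123456", "password"},
    {"password", "Tr0ub4dor&3"},
    {"Tr0ub4dor&3", "correcthorse"},
]
result = sample_and_eliminate(rules, pr, eps=0.1, s=2000, rng=random.Random(0))
print(result)
```

On this instance the loop removes the rules containing the two popular passwords in turn, and the rule set with the lowest estimated top probability is the last one visited.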



Proof. Let

$$\mathrm{BAD}_i = \left\{ \exists w \in \mathcal{A}_{S_i} : \left| \frac{s_w}{s} - \Pr[w \mid \mathcal{A}_{S_i}] \right| \ge \epsilon/2 \right\}$$

denote the event that our probability estimates are off during iteration $i$. Claim 5.2
bounds the probability of any bad event. The proof of Claim 5.2 can be found in the
appendix. The proof involves bucketing the passwords based on their probability, applying
Chernoff Bounds to upper bound the probability of a bad estimate for our passwords
in each bucket, and repeatedly applying union bounds.

Claim 5.2. $\Pr\left[\bigcup_i \mathrm{BAD}_i\right] \le \delta$.

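For the rest of the analysis we assume that no bad event occurs. Let $p^* =
\min_{S \subseteq [m]} p(1, \mathcal{A}_S)$ and suppose that $\mathcal{A}_{S^*} \subseteq \mathcal{A}_{S_i}$. Clearly, this is true when $i = 0$. If
$p_i > \epsilon/2 + p^*$ then $\Pr[w^* \mid \mathcal{A}_{S^*}] \ge \Pr[w^* \mid \mathcal{A}_{S_i}] > p^*$ so that $w^* \notin \mathcal{A}_{S^*}$. Hence, $\mathcal{A}_{S^*} \subseteq \mathcal{A}_{S_{i+1}}$,
and the property is maintained for at least one more iteration. If instead $p_i \le \epsilon/2 + p^*$
then we have $p_{i^*} \le p_i \le p^* + \epsilon/2$ so for each $w \in \mathcal{A}_{S_{i^*}}$ we have $\Pr[w \mid \mathcal{A}_{S_{i^*}}] \le p^* + \epsilon$. We
conclude that the solution $S_{i^*}$ is a $(1, \epsilon)$-approximation. □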

We next explain how to extend Algorithm 2 to $(1, \epsilon)$-approximate the optimal
$p(k, \mathcal{A}_S)$ for any constant $k$.

Theorem 5.3. There is an algorithm which runs in polynomial time (in $m$, $1/\epsilon$, $1/\delta$),
takes a polynomial number of samples, and returns a $(1, \epsilon)$-approximation $S \subseteq [m]$ of
$p(k, \mathcal{A}_{S^*})$ with probability at least $1 - \delta$.

Proof sketch. To extend Algorithm 2 to $(1, \epsilon)$-approximate $p(k, \mathcal{A}_S)$ for constant
$k$ we need one more idea. We cannot simply obtain a reduced password space $\hat{\mathcal{P}}$ by
reducing preference lists because we can only sample from our distribution. Notice
that for any $S \subseteq [m]$ such that $i \in S$ we have $\Pr[w \mid \mathcal{A}_S] \le \Pr[w \mid \mathcal{A}_{\{i\}}]$, so to obtain a
$(1, \epsilon)$-approximation it is sufficient to limit our attention to the set $\hat{\mathcal{P}}$ of passwords $w$
for which there exists a rule $i$ such that $\Pr[w \mid \mathcal{A}_{\{i\}}]$ is sufficiently large.

We can obtain a superset of $\hat{\mathcal{P}}$ by sampling. For each positive rule $R_i$ we draw $s$ independent
samples from the distribution induced by $\mathcal{A}_{\{i\}}$ and collect a set $T_i$ of sampled passwords.
Intuitively, a password $w$ is included in $T_i$ if and only if our estimated probability
is sufficiently large. Let $T = \bigcup_i T_i$. For a sufficiently large sample size $s =
O(\mathrm{poly}(m, k, 1/\epsilon, 1/\delta))$ we can apply Chernoff Bounds to argue that with probability
$1 - \delta$ (1) $|T|$ is small, i.e., $O(\mathrm{poly}(m, k, 1/\epsilon, 1/\delta))$, and (2) $T \supseteq \hat{\mathcal{P}}$. □

6. EXPERIMENTS 

To demonstrate how our ideas could apply in a real-world scenario, we simulated
runs of Algorithm 5 by sampling with replacement from the RockYou leaked password
set [Imperva 2010]. The set contains over 32 million passwords with a frequency
distribution similar to that of many other password sets [Bonneau 2012]. Note that
all results presented here are limited by the dataset and assume the normalization
model. Working in the normalization model is crucial because we cannot ask the RockYou
users for their preferred password under a specific policy; an initial distribution
over $\mathcal{P}$, which is available to us, is sufficient though, because it induces a distribution
for any policy $\mathcal{A}$.

We selected 21 positive rules that mirror password composition rules commonly
used in practice, and looked at sample sizes $s$ of 100, 500, 1000, 5000, and
10000. The rules included length requirements, character class requirements, combinations
of requirements, a dictionary check, etc. (See Appendix C for a complete listing
of the rules we selected.) For each run with a particular value of $s$, the algorithm returns
a policy $\mathcal{A}_S$ for which we can measure $p(1, \mathcal{A}_S)$ in the original dataset and compare
with the optimal $p(1, \mathcal{A}_{S^*})$, determined from running Algorithm 3 on the original
dataset. We performed 500 runs for each of the five values of $s$.
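Measuring $p(1, \mathcal{A}_S)$ on a frequency table is straightforward once rules are written as predicates. The sketch below uses two rules phrased like those in Table II and a tiny made-up frequency table (not RockYou data); the rule encodings, function names, and the exhaustive search over rule subsets are illustrative assumptions and do not reproduce Algorithm 3.

```python
from itertools import combinations

# Hypothetical encodings of two positive rules.
RULES = {
    "8 chars, 1 upper, 1 digit": lambda w: (
        len(w) >= 8 and any(c.isupper() for c in w) and any(c.isdigit() for c in w)
    ),
    "14 chars": lambda w: len(w) >= 14,
}


def p1(counts, active):
    """Frequency of the most common allowed password under the union of active rules."""
    allowed = {w: c for w, c in counts.items() if any(RULES[r](w) for r in active)}
    total = sum(allowed.values())
    return max(allowed.values()) / total if total else None


def best_policy(counts):
    """Exhaustively score every non-empty subset of rules (feasible only for few rules)."""
    names = list(RULES)
    subsets = [c for r in range(1, len(names) + 1) for c in combinations(names, r)]
    scored = [(p1(counts, c), c) for c in subsets]
    return min((sc for sc in scored if sc[0] is not None), key=lambda sc: sc[0])


# Made-up counts for illustration only.
counts = {"123456": 50, "Password1": 6, "Summer2012": 3, "correcthorsebattery": 1}
score, policy = best_policy(counts)
print(score, policy)
```

Here the union of both rules beats either rule alone, because adding allowed passwords dilutes the most common one.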

To gain an understanding of how policies based on negative rules perform, we took
the complement of the 21 positive rules selected above to get 21 negative rules. We then
determined the optimal negative rules policy by calculating $S^* = \arg\min_{S \subseteq [m]} p(1, \mathcal{A}_S)$
via brute-force. This was required because we have no equivalent to Algorithm 3 for
negative rules. With this baseline in hand, we designed two naive algorithms, similar
in spirit to Algorithm 5. There are multiple ways to discard a password in the negative
rules setting, and one algorithm makes this decision randomly while the other bans
the smallest subset as determined from the current sample. Again, 500 runs were
performed for each $s \in \{100, 500, 1000, 10000, 50000\}$.



Table II: Baseline probabilities for the RockYou dataset

Baseline                              | p(1, A_S)   | S
Mean across negative rules policies   | 1.3 x 10^-2 |
Mean across positive rules policies   | 1.0 x 10^-2 |
All passwords allowed (no policy)     | 9.2 x 10^-3 |
One positive rule (S ∈ {1, ..., m})   | 6.8 x 10^-4 | 8 chars, 1 upper, 1 digit
Optimal policy with positive rules    | 4.4 x 10^-4 | 14 chars OR 2 symbols OR 8 chars, 1 upper, 1 digit
Optimal policy with negative rules    | 1.4 x 10^-4 | 10 chars AND 2 digits AND 1 symbol AND 1 lowercase AND not in dictionary

Table III: Performance of Sampling Algorithms with Positive Rules

Sample Size | mean p(1, A_S) | min p(1, A_S) | % Optimal
100         | 6.8 x 10^-3    | 1.2 x 10^-3   |
500         | 9.7 x 10^-4    | 4.4 x 10^-4   | 2%
1000        | 9.5 x 10^-4    | 4.4 x 10^-4   | 10%
5000        | 6.0 x 10^-4    | 4.4 x 10^-4   | 14%
10000       | 5.7 x 10^-4    | 4.4 x 10^-4   | 19%



Table IV: Performance of Sampling Algorithms with Negative Rules

            | Random Decision                | Ban Smallest
Sample Size | mean p(1, A_S) | min p(1, A_S) | mean p(1, A_S) | min p(1, A_S)
100         | 6.8 x 10^-3    | 1.2 x 10^-3   | 7.2 x 10^-3    | 2.3 x 10^-3
500         | 4.4 x 10^-3    | 6.3 x 10^-4   | 9.0 x 10^-3    | 2.3 x 10^-3
1000        | 4.3 x 10^-3    | 4.5 x 10^-4   | 8.6 x 10^-3    | 2.3 x 10^-3
5000        | 6.3 x 10^-3    | 4.5 x 10^-4   | 9.2 x 10^-3    | 9.2 x 10^-3
10000       | 7.2 x 10^-3    | 4.5 x 10^-4   | 9.2 x 10^-3    | 9.2 x 10^-3



6.1. Baselines 

We examined several baselines for comparison with our algorithm. Table II shows
these baselines, the probability of the most frequent password in the resulting policy,
and the optimal policy as a union or intersection of rules (for clarity, the complement
of the union of negative rules is shown as the intersection of positive rules).

As shown in Table II, from the means across policies, randomly selecting a policy from
the power set of rules can be worse than having no policy. The "one rule maximum"
baseline was selected because, if decided based on sampling, only $m$ distributions need
be sampled. Our efficient algorithm requires the same amount of sampling, but can
find the optimal policy over $S \subseteq [m]$ rather than $S \in \{1, \ldots, m\}$. Also of interest is the
optimal policy with negative rules, which is over 3x better than the optimal policy
with positive rules. However, as shown in the following section, the performance of
our sampling algorithms with negative rules was far worse than in the positive rules
setting.

6.2. Performance 

In the positive rules setting (see Table III), the algorithm performed extremely well
even at moderate sample sizes. The average policy selected with $s = 500$ was almost
10x better than having no policy. At $s = 1000$, the optimal policy was found 10% of the
time (50 out of 500 times).

In the negative rules setting (see Table IV), however, neither algorithm found the optimal
policy. The "Ban Smallest" heuristic, when faced with a choice between multiple
subsets that contain the most likely password, decides to ban the smallest available
subset, disrupting the space the least. This might seem like an intuitively good choice
but, in fact, it fails to find a better policy than the empty set at large sample sizes.
The randomized algorithm does better (it cannot actually do worse) but still has much
worse average case performance than using our efficient algorithm with positive rules.

7. DISCUSSION 

We conclude by discussing some key points. 

Where do the rules come from? Throughout the paper we have assumed that
the rules (whether positive or negative) are given as part of the input; it is not up
to us to find these rules. Our experiments indicate that a collection of intuitive and
practical rules can already give very good results on real data. However, the question
of deciding which rules should be added to our collection is outside the scope of this
paper. Much like the problem of feature selection, it is an interesting problem with
real-life implications, which we suspect will be very difficult in practice.

Alternate policy goals. Our goal [Boztas 1999] has been to minimize $p(k, \mathcal{A}_S)$.
Intuitively, $p(k, \mathcal{A}_S)$ represents the probability that an adversary with no background
knowledge can successfully guess the password of a randomly selected user in $k$ tries.
A small value of k optimizes security guarantees against an online guessing attack in 
which the adversary is locked out after k failed attempts to login. A much larger value 
of fc (e.g., 2'^^) is necessary to optimize security against an adversary who has obtained 
the cryptographic hash of a password and is able to mount a brute-force dictionary 
attack [Seeley 1989]. However, the optimal solutions for p (1, As) andp (2^^, As) might 
be completely different. One stronger goal that we might hope to achieve is to optimize
both goals simultaneously. More formally, can we find a policy $S \subseteq [m]$ such that for
every $S' \subseteq [m]$ and every $k \le N$ we have $p(k, \mathcal{A}_S) \le c \cdot p(k, \mathcal{A}_{S'})$ for some constant $c$?
Unfortunately, the answer is no. For any constant $c$ this universal approximation goal
is impossible to satisfy in the ranking model (see Theorem B.1).

Other natural goals include $\alpha$-work-factor [Pliam 2000] and a refinement called
$\alpha$-guesswork [Bonneau 2012] (e.g., maximize the total number of guesses needed to compromise
an $\alpha$-fraction of the accounts). While $\alpha$-guesswork is a useful metric to analyze
the security of 70 million Yahoo passwords [Bonneau 2012], it may not be a desirable
optimization goal for the organization because it might allow the adversary to crack
up to an $(\alpha - \epsilon)$-fraction of the accounts with relatively few guesses.
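The concern can be made concrete with a small sketch. It computes the $\alpha$-work-factor in the sense of the parenthetical above, i.e. the number of guesses an adversary guessing in decreasing order of probability needs to cover an $\alpha$-fraction of the accounts; it does not implement the $\alpha$-guesswork refinement. The function name and the toy distribution are assumptions made for illustration.

```python
def alpha_work_factor(probs, alpha):
    """Fewest guesses, most likely passwords first, covering at least an alpha fraction."""
    covered = 0.0
    for guesses, pr in enumerate(sorted(probs, reverse=True), start=1):
        covered += pr
        if covered >= alpha:
            return guesses
    return None  # alpha exceeds the total probability mass


# Made-up distribution: one very popular password and 100 rare ones.
probs = [0.45] + [0.0055] * 100

print(alpha_work_factor(probs, 0.5))   # guesses needed to reach half of the accounts
print(max(probs))                      # fraction cracked by the very first guess
```

With these toy numbers, reaching half of the accounts takes 11 guesses, yet 45% of them fall to the first guess alone, which is the situation the paragraph warns about and the reason $p(k, \mathcal{A}_S)$ for small $k$ captures a different goal.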

Another interesting direction is to account for an adversary with basic background 
information about the user (e.g., e-mail address, username, birthday). It may not 
always be realistic to assume that the adversary has no background knowledge 
because the adversary can often easily obtain some background knowledge about a 
user by searching for publicly available information on the internet. One approach 
might be to design a rule R to specify different passwords for different users (e.g., the 
set of passwords that contain the username or birthday of the user). 

Open Questions. While we were able to prove several hardness results about finding
the optimal password composition policy in the negative rules setting, it is possible
that these hardness results could be circumvented by making mild (hopefully realistic)
assumptions about the underlying password distribution or the rules $R_1, \ldots, R_m$.
Are there efficient algorithms to optimize $p(k, \mathcal{A}_S)$ in the negative rules setting given
realistic assumptions? It is also possible that mild realistic assumptions could be used
to circumvent the impossibility result of Theorem B.1 and design a universal approximation
algorithm.

There are also several interesting technical questions that remain open:

(1) Normalization model with negative rules: Can we efficiently $c$-approximate
$p(1, \mathcal{A}_{S^*})$ for any constant $c$? Is there a sub-exponential algorithm (in $m$) to compute
$p(1, \mathcal{A}_{S^*})$?

(2) Ranking model with positive rules: Can we efficiently $c$-approximate $p(k, \mathcal{A}_{S^*})$ for
some constant $c$ when $k$ is a parameter?

The future. There is a real need for a principled approach to optimizing password 
composition policies. We have taken a first step in this direction by providing an intu- 
itive theoretical model and showing that it leads to algorithms that perform well on 
real data. We can only hope that our work will spark a fundamentally new interaction 
between theory and practice in passwords research. 
A. MISSING PROOFS

Reminder of Theorem 3.5. There exists a constant $c > 1$ such that it is UGC-hard
for a poly$(n, N, k)$-time algorithm to $c$-approximate the optimal $p(k, \mathcal{A})$ in the singleton
rules setting and the rankings model.

Proof of Theorem 3.5. We begin with a construction of a bounded degree graph which
is hard to approximate up to a (say) 1.5-factor. As shown in [Austrin et al. 2011], for every
constant $d$ there exists a family of $d$-regular graphs for which it is UGC-hard to
determine whether there exists a vertex cover of size $t$, or all vertex-covers have size
at least $(2 - O(\log\log(d)/\log(d)) - \epsilon)t$. Fixing $d$ to be a large enough constant such that
this factor is $\ge 1.5$, we now reduce this family of instances to a password problem using
the exact same construction as in the proof of Theorem 3.4 with the exception that we
set $k = g + e - (1.5 - \epsilon)t$.

Observe that for this family of instances, $e = \Theta(g)$ so $|\mathcal{P}| = \Theta(g)$, but also the size of the
optimal vertex-cover has to be $\Theta(g)$ (at most $g$ and at least $g/d$). Furthermore, each
password appears at the top of at most $d$ preference-lists. Therefore, by allowing $\mathcal{A}$ and
banning $B = \mathcal{P} \setminus \mathcal{A}$, we not only have a distribution whose support is of size $|L_B|$, but it
also holds that the probability of each word in $L_B$ is $\Omega(1/|L_B|)$.

Therefore, if the graph has a vertex-cover $C$ of size $t$, then by banning all words
$B = \{w_u : u \in C\}$ we have that the $n$ preference-lists induce a distribution over
$|L_B| \ge g + e - t$ words. Since we set $k = g + e - (1.5 - \epsilon)t$ we have that the set of most
uncommon passwords contains at least $(0.5 - \epsilon)t = \Omega(|L_B|)$ words, each with $\Omega(1/|L_B|)$
probability, thus $p(k, \mathcal{A}) = 1 - \Omega(1)$. (And, in particular, for the optimal policy $\mathcal{A}^*$ we
have $p(k, \mathcal{A}^*) = 1 - \Omega(1)$.)

In contrast, applying the same argument from the proof of Theorem 3.4 we have
that if $G$ has all vertex-covers of size $\ge (1.5 - \epsilon)t$ then $p(k, \mathcal{A}) = 1$. The constant-factor
hardness of approximation follows. □

Reminder of Theorem 3.6. Let $\epsilon > 0$. Unless P = NP there is no polynomial time
algorithm (in $N, n, m$) that approximates $\min_{S \subseteq [m]} p(1, \mathcal{A}_S)$ to a factor of $n^{1/3 - \epsilon}$ in the
negative rules setting and the rankings model.

Proof of Theorem 3.6. Fix $\epsilon > 0$. Our reduction is from the Max-Independent-Set problem,
which is known to be hard to approximate up to a factor of $n^{1 - \epsilon}$ [Hastad 1996].
We are given a graph $G$ with $g$ vertices and $e$ edges, and we must determine whether
the size of $G$'s largest independent set is $g^{1 - \epsilon}$ or $g^{\epsilon}$.

Given a Max-Independent-Set instance, we denote $K = g^{\epsilon}$ and create the following
password policy instance, which is composed out of the following set of possible words:

V = {Ai,...,AK}U{Bi,...,Bg} 

\{u,v}€E{G) J 
\veV{G),l<i<j<K J 

We now describe the $n = g + ge + g^2\binom{K}{2} \le g^3 + g^{2 + 2\epsilon}$ users' preference-lists. We start
with the $g$ rankings specified in Table V(a). We continue with $ge$ more rankings, where
for each edge $(u, v) \in E(G)$ we add $g$ more rankings, as detailed in Table V(b). Lastly,
we add $g^2\binom{K}{2}$ more rankings, where for each triple $(v, i, j)$ where $v$ is a vertex of $G$ and
$i \ne j \in [K]$ we add $g$ rankings, as detailed in Table V(c). (Observe that the tables detail the
first few words in each list, then end with a ". . ." mark, which indicates that from that
point on the remaining words may appear in any order.)

Table V: Rankings used in the proof of Theorem 3.6. (a) First type, with entries $A_1, A_2, \ldots, A_K, B_i$. (b) Second type, with final entry $X$. (c) Third type, with entries $D_{v,i,j,1}$, $D_{v,j,i,1}$ and final entry $X$.

Finally, we detail our rules. For every $i \in [K]$ and $u \in V(G)$ we have a rule which
roughly corresponds to deciding that $u$ is a member of the independent set:



Ru,i = {A,}u u {ci„ci^,...,c:ju y 

{v. («,«)e-B(G)} ]elK].j=ii 
Our analysis now follows from a series of observations. 



1 ^u,i,j,g} ■ 



Observation 1: If we do not ban all of the passwords $A_1, \ldots, A_K$ then $p_1 \ge g/n$. Therefore,
for every $i$, we must choose at least one of the rules $\{R_{u,i}\}$ to activate, or else we have
that $p_1 \ge g/n$.

Observation 2: If we ban . . . , then we must have $p_1 \ge g/n$.
Therefore, for any $i \ne j$ it must not be the case that we ban $R_{u,i}$ and $R_{v,j}$, where
$(u, v) \in E(G)$, or else we have that $p_1 \ge g/n$.

Observation 3: If we ban $D_{v,i,j,1}, \ldots, D_{v,i,j,g}$ and $D_{v,j,i,1}, \ldots, D_{v,j,i,g}$ then $p_1 \ge g/n$.
Therefore, for any $i \ne j$ it must not be the case that we ban $R_{u,i}$ and $R_{u,j}$, or else
we have that $p_1 \ge g/n$.

These observations lead us to the following conclusion. If $G$ contains an independent
set $v_1, \ldots, v_K$ of size $K$, then activating the rules $\{R_{v_1,1}, R_{v_2,2}, \ldots, R_{v_K,K}\}$ leads to a
setting where each truncated ranking begins with a unique word, so $p_1 = 1/n$. In
contrast, if $G$ does not have an independent set of size $K$, then $p_1 \ge g/n$. Since $n =
O(g^3)$ we have an $\Omega(n^{1/3})$-hardness of approximation. Observe also that the number
of total words is $N = K + g + 2eg + g^2 K(K - 1) + 1 = O(g^3) = O(n)$ so it is also hard to
approximate the problem to a factor of $\Omega(N^{1/3})$. □

Reminder of Theorem 4.1. For every $k$, Algorithm 4 computes $\arg\min_{\mathcal{A}} p(k, \mathcal{A})$ in
the singleton rules setting of the normalized probabilities model, in time $O(N \log(N))$.



Proof of Theorem 4.1. Let $\mathcal{A}^*$ denote the optimal solution, denote its $k$ most popular
passwords as $w_{i_1}, \ldots, w_{i_k}$, and denote also $P^*$ as the total probability mass of the words
in $\mathcal{A}^*$ according to the initial distribution: $P^* = \sum_{w \in \mathcal{A}^*} \Pr[w]$. Therefore,
$p(k, \mathcal{A}^*) = \frac{1}{P^*}\sum_{j=1}^{k} \Pr[w_{i_j}]$.

Clearly, all words $w_j$ s.t. $j > i_k$ belong to $\mathcal{A}^*$; otherwise, we could add such a word
and decrease the probability of the top $k$ words. Similarly, all words $w_j$ s.t. $j < i_1$ must
not belong to $\mathcal{A}^*$, otherwise they would belong to the set of most popular $k$ words. We
now claim that $w_{i_1}, \ldots, w_{i_k}$ are $k$ consecutive words.
Suppose that there was some word $w'$ between some $w_{i_j}$ and $w_{i_{j+1}}$. Then $\mathcal{A}^*$ clearly
banned it, otherwise it would be one of the most popular $k$ words. We claim that the
policy $\mathcal{A}'$ where we ban $w_{i_1}$ and allow $w'$ instead satisfies $p(k, \mathcal{A}') \le p(k, \mathcal{A}^*)$.

We denote $p_1 = \Pr[w_{i_1}]$, $q = \sum_{j=2}^{k} \Pr[w_{i_j}]$, $p' = \Pr[w']$, and we know $p_1 \ge p'$. Then
$p(k, \mathcal{A}^*) = (p_1 + q)/P^*$, whereas

$$p(k, \mathcal{A}') = \frac{p' + q}{P^* - p_1 + p'}.$$

Our goal is to show $p(k, \mathcal{A}') \le p(k, \mathcal{A}^*)$, which holds iff

$$(p' + q)P^* \le (p_1 + q)(P^* - (p_1 - p')).$$

By some algebraic manipulations, this holds iff

$$(p_1 - p')P^* \ge (p_1 - p')(p_1 + q),$$

which clearly holds because $p_1 - p'$ is a non-negative quantity, and $p_1 + q \le P^*$.

As for the running time of the algorithm, it is obvious that sorting requires
$O(N \log N)$ time. Finding the minimum requires only $O(N)$ time: if we denote
$a_i = \sum_{i \le j < i+k} \Pr[w_j]$ and $b_i = \sum_{i \le j} \Pr[w_j]$, then based on $a_i$ and $b_i$ it is easy to compute $a_{i+1}$
and $b_{i+1}$ in $O(1)$ time. □

Reminder of Lemma 4.3. Fix $m$ and $s$ such that $m > s$. There exists a domain $D$ of size
$\Theta(s^2 \log(m))$ and a family of $m$ sets, $F_1, F_2, \ldots, F_m \subset D$, such that each set in the family
contains $|D|/(2s)$ elements, and for every $C \subset [m]$ of size $|C| \le s$, we have that the size of
the union satisfies $\left|\bigcup_{i \in C} F_i\right| \ge \frac{|C|}{8s}|D|$. This domain can be constructed in randomized
poly$(s, m)$ time.

Proof of Lemma 4.3. Given $m$ and $s$, we first pick a random function $\phi : [m] \to [2s]$.
Fixing a subset $C \subset [m]$ of size $|C| \le s$, we claim that $|\phi(C)| \ge |C|/2$ w.p. at least
$1 - (0.825)^{|C|}$. Indeed,

$$\Pr[|\phi(C)| < |C|/2] \le \Pr[\exists T \subset [2s] \text{ s.t. } |T| = |C|/2 \text{ and } \forall i \in C,\ \phi(i) \in T]$$
$$\le e^{|C|/2}\left(\frac{|C|}{4s}\right)^{|C|/2} \le \left(\frac{\sqrt{e}}{2}\right)^{|C|} < (0.825)^{|C|}.$$

So assuming $|C| \ge 8$ we have that $C$ is mapped to at least $|C|/2$ distinct images by $\phi$
w.p. $\ge 3/4$. Also, if $|C| \le 7$ then the probability of even two elements getting mapped to the
same image is at most $\binom{7}{2}\frac{1}{2s} \le 0.25$ for $s \ge 42$.

We now construct $D$ by taking $d$ independently chosen such $\phi$-mappings, which
we denote as $\phi_1, \phi_2, \ldots, \phi_d$, and so $D = [2s] \times [d]$. We construct the family $F_i =
\{(\phi_1(i), 1), (\phi_2(i), 2), \ldots, (\phi_d(i), d)\}$ for every $i \in [m]$. Clearly, for every $i$ it holds that
$|F_i| = d = |D|/2s$. Suppose for the sake of contradiction that there exists some $C \subset [m]$
of size $\le s$ such that $|\bigcup_{i \in C} F_i| < \frac{|C|}{4} d$. By construction, we have that

$$\left|\bigcup_{i \in C} F_i\right| = \sum_{j=1}^{d} |\phi_j(C)|,$$

so by the Markov inequality there are at least $d/2$ functions for which the cardinality
of the image of $C$ is less than $|C|/2$. Let $X_{C,j}$ be the indicator random variable of $\phi_j$
mapping the set $C$ to no more than $|C|/2$ distinct elements. The Hoeffding bound gives
that

$$\Pr\left[\exists C \text{ of size} \le s \text{ s.t. } \sum_j X_{C,j} > d/2\right] \le \sum_{C} \Pr\left[\frac{1}{d} \sum_j X_{C,j} > 0.5\right] \le m^{O(s)} e^{-d/10}.$$

Setting $d = \Theta(s \log m)$ gives that w.p. $\ge 1/2$ no such $C$ exists. □
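
The construction in this proof can be sketched in a few lines of Python. This is only an illustration under stated assumptions: indices are 0-based, the parameter values are arbitrary toy choices rather than the asymptotic settings of the claim, and the helper names `build_family` and `union_size` are hypothetical.

```python
import random


def build_family(m, s, d, seed=0):
    """Draw d random maps phi_j : [m] -> [2s] and return F_0..F_{m-1}, where
    F_i = {(phi_j(i), j) : j in [d]} is a subset of D = [2s] x [d]."""
    rng = random.Random(seed)
    phis = [[rng.randrange(2 * s) for _ in range(m)] for _ in range(d)]
    return [frozenset((phis[j][i], j) for j in range(d)) for i in range(m)]


def union_size(family, subset):
    """Size of the union of F_i over i in subset; by construction this
    equals the sum over j of the number of distinct images phi_j(subset)."""
    covered = set()
    for i in subset:
        covered |= family[i]
    return len(covered)


family = build_family(m=20, s=5, d=40)   # toy parameters, chosen arbitrarily
C = [0, 3, 7]
print(len(family[0]), union_size(family, C))
```

Every set has exactly $d$ elements because the second coordinate $j$ is distinct for each mapping; the union size depends on how many collisions each $\phi_j$ produces on $C$.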

Reminder of Theorem 4.4. Unless P = NP there is no polynomial time algorithm
(in $N, m, n$) which outputs $\arg\min_{S \subset [m]} p(k, A_S)$ in the positive rules setting and the
normalization model.

Proof of Theorem 4.4. Our reduction is from set cover.
Set Cover Instance: Sets $S_1, \ldots, S_m$, Universe $U = \{1, \ldots, n\}$ and integer $k$.
Question: Is there a set cover of size $k - 1$?
Now we define $W_1, \ldots, W_n$ to be $n$ disjoint sets of passwords

= {w^^e\l <£< n^m^} . 

We also define special passwords $t_j$ ($j \le m$) and $\tau_j$ ($j \le k$) which are not contained in
any $W_i$.

We define the following positive password rules:

$$A_i = \{t_i\} \cup \{\tau_j \mid 1 \le j \le k\} \cup \bigcup_{j \in S_i} W_j.$$

We assign probabilities as follows: 

Pr [w,j] = (1 ^ 



for each i < n and i < m^n^. Observe that 



Pr 



1 

1 T 



so that almost all of the probability mass is concentrated inside the sets Wi and the 
probability mass is uniformly distributed. We also set 

1 - X 



and 

X 



I L lib 

where $0 < x < 1$ will be defined later. First notice that



j^k j<ni 



1 — a; \ / X 

- m 



rfik 



so our probability distribution is well defined. Suppose that there is a set cover $C \subset [m]$
s.t. $|C| \le k - 1 \wedge \bigcup_{i \in C} S_i = U$, and consider the solution $A_C$. We cover all $W_i$'s and use
at most $k - 1$ $t$'s. Hence,



p{k,Ac) < ((fc-l)Pr[i]+Pr[T]) 



n -^ - 1 
Suppose that there is no set cover of size $k$. For every set of $k$ or more rules $S$ we have
at least $k$ $t$'s in our solution so

$$p(k, A_S) \ge k \Pr[t].$$

For every set of rules S that does not cover all the Wis we have at most 
- ^) -fraction of the total probability mass so 

It suffices to select x s.t. 

or — after some algebraic manipulation — equivalently, 

(7^) (fc-2) + ^ 

^ ^ Pr[r] < Pr[t] < 6-Pr[T]> ^ 



Observe that a < Pr[r] < 6 so it suffices to set x s.t. Pr[i] = We can solve for x to 
get 



m (^-3 + 2n + 2n^ - 2n^ + k {n - if (l + n + n^)) 



m (-3 + 2n + 2n^ - 2n^) + k^ (2 - 2n - + n^) + fc + 4n + - 2^* + m (n - 1)^ (1 + n + n2)^ 

□ 

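Reminder of Claim. $\Pr[\exists i, \mathrm{BAD}_i] \le \delta$.

Proof of Claim. By the union bound it suffices to show that

$$\Pr[\mathrm{BAD}_i] \le \frac{\delta}{m}.$$

Our first step is to divide the passwords $w \in \mathcal{P}$ into buckets $B_j$ based on their
probability. For $j > 0$ we define

$$B_j = \left\{ w \;\middle|\; \frac{\epsilon}{2^j} \le \Pr[w \mid A_{S_i}] < \frac{\epsilon}{2^{j-1}} \right\},$$

and for $j = 0$ we set

$$B_0 = \{ w \mid \epsilon \le \Pr[w \mid A_{S_i}] \}.$$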
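
The bucketing step can be sketched as follows. The sketch assumes the bucket boundaries are $\epsilon/2^j \le \Pr[w \mid A_{S_i}] < \epsilon/2^{j-1}$ for $j > 0$ and $\Pr[w \mid A_{S_i}] \ge \epsilon$ for $j = 0$, which is a reading of the damaged definitions rather than a verbatim copy; the helper names and the toy distribution are hypothetical.

```python
from fractions import Fraction


def bucket_index(p, eps):
    """Smallest j >= 0 with p >= eps / 2**j (requires p > 0)."""
    j = 0
    while p < eps / 2**j:
        j += 1
    return j


def bucketize(probs, eps):
    """Group a {password: probability} mapping into buckets B_0, B_1, ..."""
    buckets = {}
    for w, p in probs.items():
        buckets.setdefault(bucket_index(p, eps), []).append(w)
    return buckets


# Toy conditional distribution (illustrative values only).
probs = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 8),
         "d": Fraction(3, 32), "e": Fraction(1, 32)}
print(bucketize(probs, Fraction(1, 4)))
```

Because every password in $B_j$ has probability at least $\epsilon/2^j$ and the probabilities sum to at most one, no bucket can hold more than $2^j/\epsilon$ passwords.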

Observe that 



Let w e Bj be given (j > 0) then by the Chernoff Bounds: 



e 



m 



Pr [s„, > s Pr [u; I As.,] + se/2] < exp i^~2^~^ log ^— j j < 
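Notice that the bucket $B_j$ contains at most $2^j/\epsilon$ passwords, i.e., $|B_j| \le 2^j/\epsilon$. Hence

$$\Pr\left[\exists w \in B_j,\ s_w > s \Pr[w \mid A_{S_i}] + s\epsilon/2\right] \le \frac{\delta}{m\, 2^{j+1}}.$$

Now if we union bound across all $j > 0$ we get

$$\Pr\left[\exists w \in \bigcup_{j=1}^{\infty} B_j,\ s_w > s \Pr[w \mid A_{S_i}] + s\epsilon/2\right] \le \sum_{j=1}^{\infty} \frac{\delta}{m\, 2^{j+1}} = \frac{\delta}{2m}.$$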



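Finally, we consider the passwords in $B_0$. By Chernoff Bounds for each $w \in B_0$ we have

$$\Pr\left[|s_w - s \Pr[w \mid A_{S_i}]| > s\epsilon/2\right] \le \frac{\delta\epsilon}{2m};$$

by applying the union bound ($|B_0| \le 1/\epsilon$) we get

$$\Pr\left[\exists w \in B_0,\ |s_w - s \Pr[w \mid A_{S_i}]| > s\epsilon/2\right] \le \frac{\delta}{2m}.$$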
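Combining our inequalities we obtain the desired result:

$$\Pr[\mathrm{BAD}_i] \le \Pr\left[\exists w \in \bigcup_{j=1}^{\infty} B_j,\ s_w > s \Pr[w \mid A_{S_i}] + s\epsilon/2\right] + \Pr\left[\exists w \in B_0,\ |s_w - s \Pr[w \mid A_{S_i}]| > s\epsilon/2\right] \le \frac{\delta}{2m} + \frac{\delta}{2m} = \frac{\delta}{m}.$$

□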



B. IMPOSSIBILITY OF CONSTANT-FACTOR UNIVERSAL APPROXIMATION 

In this section we consider the following goal: given a constant $c$, find a password
composition policy $A$ such that

$$p(k, A) \le c \cdot p(k, A'),$$

for any other policy $A'$ and every value of $k \le N$. Such a policy, if it exists, would
provide a nearly optimal defense against both online attacks and dictionary attacks
simultaneously [Seeley 1989]. Unfortunately, Theorem B.1 rules out the possibility of
a constant universal approximation in the rankings model. Our impossibility result
holds even in the singleton rules setting. We show that it is possible to construct a
distribution $\mathcal{D}$ over rankings for which no universal approximation exists.

We construct our distribution $\mathcal{D}$ (Algorithm 6) over rankings by merging two
distributions $\mathcal{D}_1$ and $\mathcal{D}_2$ over preference lists.

Intuition: Passwords sampled from $\mathcal{D}_2$ are highly secure, but passwords sampled
from $\mathcal{D}_1$ are highly insecure. To improve the security of $\mathcal{D}_1$ it is necessary to ban
all passwords in $W$, but this reduces the security of $\mathcal{D}_2$ significantly.

We make two claims: (1) We must ban all but a small subset of passwords if we want
to even approximately optimize $p(1, A)$. (2) We must keep a larger subset of passwords
to even approximately optimize $p(k, A)$ for large values of $k$.

Theorem B.1. For all constants $c > 0$ there exists a distribution $\mathcal{D}$ over rankings
such that $\forall A \subset \mathcal{P}$, $\exists A'$, $k \in \mathbb{N}$, such that

$$p(k, A) > c \cdot p(k, A').$$
Proof (sketch). Let $\mathcal{P} = W \cup X$ where $W = \bigcup_{i=1}^{r} W_i$, $W_i = \{w_{i,1}, \ldots, w_{i,t}\}$ and
$X = \{x_1, \ldots, x_L\}$ are two disjoint sets of passwords, where the parameters are set as
follows q = j^, t = L = log N and r — ^^-f^ ■ Our distribution over preference lists is 
given by Algorithm 6.

There are two cases to consider: 

Case 1: $\exists x \in W - A$. Then it is easy to see that

P(l,^)>| = y = f >2cxp(l,X) . 
Algorithm 6 Sample $\mathcal{D}$

Input:
    Parameters $L, r, q, t$
    Random number $u \in [0, 1]$
    Random permutation $\pi_{W_i}$ of $W_i$ for each $i \in \{1, \ldots, r\}$
    Random permutation $\pi_X$ over $X$
    Random permutation $\pi_{\mathcal{P}}$ of $\mathcal{P}$
Initialize: $\ell \leftarrow$ empty ranking
if $u \le q$ then                              ▷ Select from $\mathcal{D}_1$
    for $i = 1 \to r$ do
        $\ell \leftarrow (\ell, \pi_{W_i})$    ▷ Append random permutation of $W_i$
    $\ell \leftarrow (\ell, \pi_X)$            ▷ Append random permutation of $X$
else                                           ▷ Select from $\mathcal{D}_2$
    $\ell \leftarrow \pi_{\mathcal{P}}$
return $\ell$
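
As a rough illustration, Algorithm 6 could be written in Python as below. This is a sketch rather than the authors' code: the function name `sample_ranking` and the representation of $W_1, \ldots, W_r$ as a list of lists are assumptions, and the parameters $r$, $t$ and $L$ are implied by the sizes of the inputs instead of being passed explicitly.

```python
import random


def sample_ranking(W_blocks, X, q, rng=random):
    """Sample one ranking (preference list) from the merged distribution.

    W_blocks: list of r lists W_1..W_r; X: list of passwords.
    With probability q (branch D_1) the ranking is a shuffled W_1, then a
    shuffled W_2, ..., then a shuffled X; otherwise (branch D_2) it is a
    uniformly random permutation of all passwords.
    """
    if rng.random() <= q:                                  # select from D_1
        ranking = []
        for block in W_blocks:
            ranking.extend(rng.sample(block, len(block)))  # permutation of W_i
        ranking.extend(rng.sample(X, len(X)))              # permutation of X
        return ranking
    everything = [w for block in W_blocks for w in block] + list(X)
    return rng.sample(everything, len(everything))         # select from D_2
```

Under the first branch the top-ranked password is always drawn from $W_1$, which is what concentrates probability mass on those $t$ passwords when nothing is banned.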

Case 2: Suppose that $\forall x \in W$ we have $x \in A$ and consider $k = L$ with the solution $\mathcal{P}$
(don't ban any passwords). For the solution $\mathcal{P}$ we have

$$p_i = \frac{q}{t} + \frac{1 - q}{|X| + |W|}$$

for $i \le t$ (e.g., for the $t$ passwords in $W_1$), and

$$p_i = \frac{1 - q}{|X| + |W|}$$

for $i > t$.
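
To make these probabilities concrete, the following sketch sums the $k$ largest values of $p_i$ when no password is banned. It assumes $p_i = q/t + (1-q)/(|X|+|W|)$ for the $t$ passwords in $W_1$ and $p_i = (1-q)/(|X|+|W|)$ for all others; the function name and the numeric values are hypothetical and are not the parameter settings used in the proof.

```python
from fractions import Fraction


def top_k_mass_no_bans(q, t, size_X, size_W, k):
    """Total probability of the k most likely passwords when nothing is banned."""
    n = size_X + size_W
    base = (1 - q) / n            # probability of a password outside W_1
    heavy = q / t + base          # probability of a password inside W_1
    return min(k, t) * heavy + max(0, k - t) * base


# Toy values: q = 1/2, t = 2, |X| = 4, |W| = 4, k = 4.
print(top_k_mass_no_bans(Fraction(1, 2), 2, 4, 4, 4))   # 3/4
```

With $k$ equal to the total number of passwords the function returns 1, which is a quick sanity check that the two cases of $p_i$ form a probability distribution.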



1-9 \ , 1-9 



1 / 1\ L 



2 V 27 L + lO'' 
< l=p{k,A) 



□ 



C. EXPERIMENT RULES 



We selected rules based on common types of rules used in constructing password
composition policies, e.g., the policies recommended by NIST [Burr et al. 2006]. The rules
we selected are shown in Table VI. Positive and negative forms of each rule are shown.
In the positive rules setting, a password is allowed if it matches any positive rule. In
the negative rules setting, a password is banned if it matches any negative rule.
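
The two settings can be sketched in Python by treating each rule as a predicate on a password. The predicate names below are hypothetical and cover only a few of the rule types; this is not the code used in the experiments.

```python
def at_least_8_chars(pw):
    return len(pw) >= 8


def at_least_1_digit(pw):
    return sum(c.isdigit() for c in pw) >= 1


def less_than_8_chars(pw):
    return len(pw) < 8


def less_than_1_digit(pw):
    return sum(c.isdigit() for c in pw) < 1


def allowed_positive(pw, positive_rules):
    """Positive rules setting: allowed if the password matches ANY rule."""
    return any(rule(pw) for rule in positive_rules)


def allowed_negative(pw, negative_rules):
    """Negative rules setting: banned if the password matches ANY rule."""
    return not any(rule(pw) for rule in negative_rules)


print(allowed_positive("123456", [at_least_8_chars, at_least_1_digit]))
print(allowed_negative("123456", [less_than_8_chars, less_than_1_digit]))
```

A positive-rules policy is therefore a union of allowed sets, while a negative-rules policy is the complement of a union of banned sets.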

The dictionary check used the cracking dictionary from openwall.com. This
dictionary is used by one of the most well-known password crackers, John the
Ripper [Designer 2010]. Since this dictionary contains all alphabetic strings up to size
3, it was pruned to only include entries of 4 characters or more for the "contains a
dictionary word" dictionary check.
Table VI: Rules Used in Sampling Experiments

| Details               | Positive rule                                                    | Negative rule                                                        |
|-----------------------|------------------------------------------------------------------|----------------------------------------------------------------------|
| Length rules          | 8 characters or more                                             | Less than 8 characters                                               |
|                       | 9 characters or more                                             | Less than 9 characters                                               |
|                       | 10 characters or more                                            | Less than 10 characters                                              |
|                       | 11 characters or more                                            | Less than 11 characters                                              |
|                       | 12 characters or more                                            | Less than 12 characters                                              |
|                       | 13 characters or more                                            | Less than 13 characters                                              |
|                       | 14 characters or more                                            | Less than 14 characters                                              |
|                       | 15 characters or more                                            | Less than 15 characters                                              |
|                       | 16 characters or more                                            | Less than 16 characters                                              |
| Character class rules | 1 digit or more                                                  | Less than 1 digit                                                    |
|                       | 1 symbol or more                                                 | Less than 1 symbol                                                   |
|                       | 1 lowercase or more                                              | Less than 1 lowercase                                                |
|                       | 1 uppercase or more                                              | Less than 1 uppercase                                                |
|                       | 2 digits or more                                                 | Less than 2 digits                                                   |
|                       | 2 symbols or more                                                | Less than 2 symbols                                                  |
|                       | 2 lowercase or more                                              | Less than 2 lowercase                                                |
|                       | 2 uppercase or more                                              | Less than 2 uppercase                                                |
| Dictionary checks     | In a dictionary                                                  | Not in a dictionary                                                  |
|                       | Contains a dictionary word                                       | Does not contain a dictionary word                                   |
| Combination Rules     | 8 characters or more AND 1 uppercase or more                     | Less than 8 characters OR less than 1 uppercase                      |
|                       | 8 characters or more AND 1 uppercase or more AND 1 digit or more | Less than 8 characters OR less than 1 uppercase OR less than 1 digit |

Notice that for some groups of rules, e.g., length rules, digit rules, etc., the subsets 
defined by these rules are subsets or supersets of each other. For example, if the posi- 
tive rule "8 characters or more" is in a policy, adding the "10 characters or more" rule 
yields the same policy. We did this to prevent the selection of overly complex policies, 
e.g., "8 characters" OR "11 characters" OR "12 characters" OR "14 characters." How- 
ever, we also selected a couple of "combination rules" to make policies more interesting. 
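
The nesting property can be checked with a small sketch. The helper `policy` and the sample passwords are hypothetical; the point is only that adding a rule whose allowed set is contained in that of another rule leaves a positive-rules policy unchanged.

```python
def policy(passwords, rules):
    """Passwords allowed by a positive-rules policy: the union of the rules."""
    return {pw for pw in passwords if any(rule(pw) for rule in rules)}


def at_least_8(pw):
    return len(pw) >= 8


def at_least_10(pw):
    return len(pw) >= 10


sample = ["123456", "password", "correcthorse", "Tr0ub4dor&3"]
print(policy(sample, [at_least_8]) == policy(sample, [at_least_8, at_least_10]))
```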